Abstract

During the period 12/1/2004 - 5/31/2005, we proposed several approaches to
energy-efficient wireless sensor networks.

1.  We proposed an event forecasting methodology for wireless sensor networks using an interval
type-2 fuzzy logic system, which consists of sensed signal strength forecasting and event
detection. We also studied the fundamental performance analysis of different event detection
schemes.

2.  We studied a spectrum efficient coding scheme for correlated non-binary sources, because
bandwidth is constrained in wireless sensor networks.

3.  We proposed to reduce the redundancy in wireless sensor networks using the SVD-QR method.

4.  We proposed a hybrid approach, the Asynchronous Energy-Efficient MAC (ASCEMAC) protocol,
for wireless sensor networks.

5.  We studied energy-efficient queries in sensor database systems with uncertainties.

6.  We  proposed  a  fuzzy  sensor  deployment  scheme,  and  studied  clustering  in  sensor  networks 
with  fuzzy  cluster  radius. 

7.  We  proposed  a  cross-layer  (physical  layer,  data-link  layer  and  application  layer)  design 
scheme  for  mobile  ad  hoc  networks. 

Eleven  papers  were  produced  during  the  past  six  months,  and  are  attached  to  this  report. 






1  Event  Forecasting  for  Wireless  Sensor  Networks  Using  Interval 
Type-2  Fuzzy  Logic  System 

Wireless sensor networks (WSN) are often used to perform event detection, tracking, and
classification. Therefore, compared to ad-hoc networks, WSN should be event-centric. In [1], we
proposed an event forecasting scheme for wireless sensor networks using an interval type-2 fuzzy
logic system. Our event forecasting scheme consists of two steps: sensed signal strength
forecasting and event detection. We demonstrated that real-world sensed acoustic signals are
self-similar, which means they are forecastable. We showed that a type-2 fuzzy membership
function (MF), i.e., a Gaussian MF with uncertain mean, is appropriate to model the sensed signal
strength of wireless sensors. Two fuzzy logic systems (FLS), a type-1 FLS and an interval type-2
FLS, were designed for signal strength forecasting. Furthermore, we proposed a double sliding
window scheme for event detection based on the forecasted signals. Simulation results show that
the interval type-2 FLS outperforms the type-1 FLS in signal strength forecasting, and that event
detection based on the forecasted signal from the type-2 FLS performs much better than that based
on the type-1 FLS.

2  Event Detection Algorithm and Fundamental Performance Analysis
in Wireless Sensor Networks

In [3], we presented two methods for event detection: double sliding window detection and a
fuzzy logic approach. The accuracy of the results was established via a sensor network testbed
and simulations. In [5] and [6], we presented a fundamental performance analysis of event
detection in wireless sensor networks, comparing the double sliding window approach theoretically
against the fixed threshold approach. In [5], Rayleigh and Rician distributions were validated for
the sensed signals and used in the performance analysis; in [6], a Gaussian distribution with
uniformly distributed mean values was assumed for the analysis. Measures of performance for these
tasks are well defined, including detection of false alarms or misses, classification errors, and
track quality.

3  Spectrum  Efficient  Coding  Scheme  for  Correlated  Non-Binary 
Sources  in  Wireless  Sensor  Networks 

Energy-aware techniques for reducing energy consumption in distributed sensor networks have
become a prominent topic in sensor network research, and various sensor network applications have
taken energy efficiency into consideration. For correlated binary sources, distributed source
coding has been studied extensively in information theory. However, data sources from real sensor
networks are normally non-binary. In [4], we proposed a spectrum efficient coding scheme for
correlated non-binary sources in sensor networks. Our approach constructs the codeword cosets for
the source of interest, taking advantage of the statistical characteristics of the distinct
observations from sensor nodes. The coset leaders are then transmitted via the channel, and
decoding is performed with the available side information. Simulations were carried out over
independent and identically distributed (i.i.d.) Gaussian sources and data collected from an Xbow
wireless sensor network testbed. Simulation results show that the proposed scheme performs within
0.5 - 1.5 dB of the Wyner-Ziv distortion bound.

4  Redundancy Reduction in Wireless Sensor Networks Using SVD-QR

In densely deployed wireless sensor networks, not only does the data of one sensor node have
self-similarity, but the data from adjacent sensor nodes also have cross-similarity. Therefore,
the data collected from neighboring sensor nodes are highly redundant. Because of intrinsic
properties of wireless sensor networks, e.g., energy constraints and bandwidth limitations, this
information redundancy has a negative impact on the whole network. In [2], we proposed to use
Singular-Value-QR Decomposition (SVD-QR) to reduce the redundancy in wireless sensor networks.

5  A  Hybrid  Approach  for  Asynchronous  Energy-Efficient  MAC 
(ASCEMAC)  Protocol  for  Wireless  Sensor  Networks 

In [7], a novel asynchronous energy-efficient MAC protocol, ASCEMAC, was proposed for wireless
sensor networks. Our algorithm combines the energy saving strategies of both contention-based and
schedule-based MAC protocols. By applying a free-running method and a fuzzy logic rescheduling
scheme, ASCEMAC no longer requires the time synchronization that is necessary in existing
energy-efficient MAC protocols. Moreover, we presented a model based on traffic intensity and
network density to determine essential algorithm parameters, such as power on/off duration,
interval of schedule broadcast, and super-time-slot size and order. Simulation results showed that
our algorithm ensures the average successful transmission rate, decreases the average waiting time
of data packets, and reduces the average energy consumption. Therefore, network performance is
improved and network lifetime is extended by using our algorithm.

6  Energy-Efficient Query in Sensor Database Systems with Uncertainties

Query processing methods have been studied extensively in the context of database systems, but
they are not directly applicable in sensor database systems because of the characteristics of
sensor networks: their decentralized nature, the limited computational power and scarce energy of
individual sensor nodes, and the imperfect information recorded. In [8], we proposed an
energy-efficient query optimization algorithm (QOA) for imperfect information in sensor database
systems. We employed an in-network query processing method, which tasks sensor networks
through declarative queries. Given a query, our QOA generates an energy-efficient query plan
for in-network query processing. Moreover, our algorithm can explicitly expose the uncertainty and
ambiguity of query results to database users. Because of network resource constraints, it is
troublesome or even impossible to keep a large amount of data in sensor database systems. In our
algorithm, we therefore formulated the probability distribution functions (PDFs) of measurement
uncertainties according to knowledge of the observation coverage and the devices used, instead of
estimating them from prior data. The simulation results demonstrated that our algorithm can vastly
reduce resource usage and thus extend the lifetime of the sensor database system.

7  Fuzzy  Deployment  for  Wireless  Sensor  Networks 

In [9], we developed a fuzzy deployment scheme for wireless sensor networks. Traditional
deployments often assume a homogeneous environment, which ignores the effect of the terrain
profile and of obstacles such as buildings and trees. Moreover, in many applications, some areas
need to be monitored more critically than others. All these factors are combined through a fuzzy
logic system in our proposed scheme. Simulation results show that the fuzzy deployment improves
the worst-case coverage by around 5 dB.

8  Clustering  in  Sensor  Networks  with  Fuzzy  Cluster  Radius 

Previous research shows that restraining cluster size helps energy efficiency in sensor networks.
However, it is often ignored that distance estimation in sensor networks is too inaccurate
for fine-grained clustering decisions. In [10], we developed a fuzzy cluster size to
handle the distance error and non-linearity. A fuzzy logic system was developed to make clustering
decisions based on the received signal strength. Simulation results showed that the proposed fuzzy
cluster size scheme can keep the performance near the optimal range when distance estimation is
distorted by log-normal shadowing.

9  Bottom-up Cross-Layer Optimization for Mobile Ad Hoc Networks

In [11], we introduced a cross-layer design method for mobile ad hoc networks. We use a fuzzy
logic system (FLS) to coordinate the physical layer, data-link layer, and application layer for
cross-layer design. Ground speed, average delay, and packet successful transmission ratio are
selected as antecedents for the FLS. The output of the FLS provides adjusting factors for the AMC
(Adaptive Modulation and Coding), transmission power, retransmission times, and rate control
decision. Simulation results show that our cross-layer design can reduce the average delay,
increase the throughput, and extend the network lifetime. The network performance parameters also
remain stable after the cross-layer optimization.
Abstract

Wireless sensor networks (WSN) are often used to perform event detection, tracking,
and classification. Therefore, compared to ad-hoc networks, WSN should be event-centric.
In this paper, we propose an event forecasting scheme for wireless sensor networks using
an interval type-2 fuzzy logic system. Our event forecasting scheme consists of two steps:
sensed signal strength forecasting and event detection. We demonstrate that real-world
sensed acoustic signals are self-similar, which means they are forecastable. We show that
a type-2 fuzzy membership function (MF), i.e., a Gaussian MF with uncertain mean, is
appropriate to model the sensed signal strength of wireless sensors. Two fuzzy logic systems
(FLS), a type-1 FLS and an interval type-2 FLS, are designed for signal strength forecasting.
Furthermore, we propose a double sliding window scheme for event detection based on the
forecasted signals. Simulation results show that the interval type-2 FLS outperforms the
type-1 FLS in signal strength forecasting, and that event detection based on the forecasted
signal from the type-2 FLS performs much better than that based on the type-1 FLS.

Index  Terms  :  Wireless  sensor  networks,  fuzzy  logic  systems,  interval  type-2  membership 
function,  self-similarity,  forecasting,  event  detection. 
1  Introduction


Wireless sensor networking is an emerging technology that promises unprecedented ability
to monitor and manipulate the physical world via a network of densely distributed wireless
sensor nodes. The nodes can sense the physical environment in a variety of modalities,
including acoustic, seismic, thermal, and infrared. They are networked together in an ad hoc
fashion, which involves peer-to-peer communication in a network with a dynamically changing
topology. Wireless sensor networks do not rely on a preexisting fixed infrastructure, such as
a wireline backbone network or a base station. They are self-organizing entities that are
deployed on demand in support of various applications, such as security and surveillance,
monitoring of wildlife habitats, smart sensor-instrumented environments, and condition-based
maintenance of complex systems.

Sensor nodes are typically powered by small batteries that are hard to replace or recharge.
Hence, the energy constraint is a distinguishing characteristic of WSN compared with traditional
wireless ad-hoc networks. Energy consumption occurs in three domains: sensing, data processing
(including AD/DA and digital signal processing), and communications [8]. According to [1], the
sensing and signal processing parts operate at low frequency and consume less than 1 mW. This is
over an order of magnitude less than the energy consumption of the communication part. Therefore,
we prefer less communication/data exchange between sensor nodes and more local processing
on each single sensor node, so as to increase the lifetime of the WSN.

The main goal of WSN is to monitor the physical world. Usually, people are more interested
in unexpected events. For example, in a battlefield scenario, people are more interested
in the appearance of enemies. If a WSN is to monitor forest fires, an unusual increase of the
temperature should trigger a warning. Both the appearance of enemies and the
unusual increase of the temperature can be seen as events. Because of the energy constraint
of WSN mentioned previously, the ideal state of WSN should be event-driven, so that we can
power off the communication part most of the time. Only when certain sensor nodes detect
an event do they trigger the RF channel and transmit the useful information to the clusterhead or
gateway. This power on/off management will be easier if each wireless sensor can forecast
its future sensed signal strength and perform event detection.

In this paper, we propose an event forecasting scheme for wireless sensor networks using
an interval type-2 fuzzy logic system. Our event forecasting scheme consists of two steps: sensed
signal strength forecasting and event detection. We use the Xbow wireless sensor network
professional developer's kit MOTE-Kit [7] as our testbed to get data sets from different
scenarios. First of all, we show that the sensed signal strength is self-similar and long-range
dependent using variance-time plotting, a common statistical method which has been widely used to
verify self-similarity of time-series. Since the sensed signal strength is self-similar, its
characteristics can be captured. We apply a type-1 FLS and an interval type-2 FLS to sensed signal
strength forecasting. Furthermore, we perform event detection based on the forecasted signal.

The remainder of the paper is organized as follows. Section 2 studies the self-similarity of
sensed signal strength. Section 3 gives an overview of type-2 fuzzy sets and interval type-2
FLSs. In Section 4, we demonstrate that sensed signal strength in WSN should be modeled
as a type-2 MF, a Gaussian MF with uncertain mean. Hence, we apply this knowledge and
design an interval type-2 FLS to forecast the sensed signal strength in WSN. A singleton type-1
FLS is also designed for performance comparison. In Section 5, we propose a double sliding
window scheme to perform event detection based on the forecasted signal. Simulation results and
discussions are presented in Section 6. Section 7 concludes this paper.

2  Self-Similarity  of  Sensed  Signal  Strength  in  WSN 

For a detailed discussion on self-similarity in time-series, see [17] [16]. Here we briefly present
its definition [2]. Given a zero-mean, stationary time-series $X = (X_t;\ t = 1, 2, 3, \ldots)$, we
define the $m$-aggregated series $X^{(m)} = (X_k^{(m)};\ k = 1, 2, 3, \ldots)$ by summing the
original series $X$ over nonoverlapping blocks of size $m$. Then $X$ is said to be $H$-self-similar
if, for all positive $m$, $X^{(m)}$ has the same distribution as $X$ rescaled by $m^H$. That is,

$$X_t \overset{d}{=} m^{-H} \sum_{i=(t-1)m+1}^{tm} X_i \quad \forall m \in \mathbb{N}. \tag{1}$$

If $X$ is $H$-self-similar, it has the same autocorrelation function
$r(k) = E[(X_t - \mu)(X_{t+k} - \mu)] / \sigma^2$ as the series $X^{(m)}$ for all $m$, which means
that the series is distributionally self-similar: the distribution of the aggregated series is the
same as that of the original.

Self-similar processes can show long-range dependence. A process with long-range dependence
has an autocorrelation function $r(k) \sim k^{-\beta}$ as $k \to \infty$, where $0 < \beta < 1$.
The degree of self-similarity can be expressed using the Hurst parameter $H = 1 - \beta/2$. For
self-similar series with long-range dependence, $1/2 < H < 1$. As $H \to 1$, the degree of both
self-similarity and long-range dependence increases.

One method that has been widely used to verify self-similarity is the variance-time plot,
which relies on the slowly decaying variance of a self-similar series. The variance of $X^{(m)}$ is
plotted against $m$ on a log-log plot; a straight line with slope $-\beta$ greater than $-1$ is
indicative of self-similarity, and the parameter $H$ is given by $H = 1 - \beta/2$. We use this
method to verify the self-similarity of the acoustic signal.
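The variance-time method described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used for the experiments: the function names are hypothetical, and the aggregated series is formed from block means (an assumption, chosen so that the variance decays with slope $-\beta$ as the text states).

```python
import math
import random


def block_means(series, m):
    """Average the series over non-overlapping blocks of size m."""
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def fit_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def hurst_from_variance_time(series, block_sizes):
    """Estimate (H, beta) from the slope of the variance-time plot."""
    log_m = [math.log10(m) for m in block_sizes]
    log_var = [math.log10(variance(block_means(series, m))) for m in block_sizes]
    beta = -fit_slope(log_m, log_var)  # the fitted slope is -beta
    return 1.0 - beta / 2.0, beta


# Sanity check on uncorrelated Gaussian noise, which has no long-range
# dependence: the fitted slope should be close to -1, i.e. H close to 0.5.
rng = random.Random(0)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]
h_est, beta_est = hurst_from_variance_time(white_noise, [1, 2, 4, 8, 16, 32, 64])
```

For a self-similar trace with long-range dependence, the same call would be expected to return an estimate of $H$ between 1/2 and 1.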

In our experiments, 8 sensors were deployed in a lab. The location of each sensor is plotted
in Fig. 1. We designed two scenarios, one with a fixed source and the other without a
fixed source. In Fig. 2, we plot the variance of $X^{(m)}$ against $m$ on a log-log plot for the
data of each of the 8 sensor nodes in the first scenario; Fig. 3 shows the same plot for the second
scenario. The two figures show clearly that the sensor network data are self-similar under both
conditions, because their traces have slopes much greater than $-1$.
3  Introduction of Type-2 Fuzzy Set and Interval Type-2 Fuzzy
Logic Systems

3.1  Introduction  to  Type-2  Fuzzy  Set 

The  concept  of  type-2  fuzzy  sets  was  introduced  by  Zadeh  [18]  as  an  extension  of  the  concept  of 
an  ordinary  fuzzy  set,  i.e.,  a  type-1  fuzzy  set.  Type-2  fuzzy  sets  have  grades  of  membership  that 
are themselves fuzzy [3]. A type-2 membership grade can be any subset in [0, 1], the primary
membership; and, corresponding to each primary membership, there is a secondary membership
(which can also be in [0, 1]) that defines the possibilities for the primary membership. A type-1
fuzzy  set  is  a  special  case  of  a  type-2  fuzzy  set;  its  secondary  membership  function  is  a  subset 
with  only  one  element,  unity.  Type-2  fuzzy  sets  allow  us  to  handle  linguistic  uncertainties,  as 
typified  by  the  adage  “words  can  mean  different  things  to  different  people.”  A  fuzzy  relation  of 
higher  type  (e.g.,  type-2)  has  been  regarded  as  one  way  to  increase  the  fuzziness  of  a  relation, 
and,  according  to  Hisdal,  “increased  fuzziness  in  a  description  means  increased  ability  to  handle 
inexact  information  in  a  logically  correct  manner  [5]”. 

Figure 4 shows an example of a type-2 set. The domain of the membership grade
corresponding to x = 4 is also shown. When the membership grade for every point is a Gaussian
type-1 set contained in [0, 1], we call such a set a “Gaussian type-2 set”. When the membership
grade for every point is a crisp set, the domain of which is an interval contained in [0, 1], we
call such type-2 sets “interval type-2 sets” and their membership grades “interval type-1 sets”.
Interval type-2 sets are very useful when we have no other knowledge about secondary memberships.
An interval type-2 MF is characterized by an upper and a lower MF [10]. The upper MF and the
lower MF are two type-1 MFs which are bounds for the footprint of uncertainty of an interval
type-2 MF. The upper MF is a subset which has the maximum membership grade of the
footprint of uncertainty; and the lower MF is a subset which has the minimum membership
grade of the footprint of uncertainty.
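Example 1: Gaussian Primary MF with Uncertain Mean

Consider the case of a Gaussian primary MF having a fixed standard deviation, $\sigma_k^l$, and an
uncertain mean that takes on values in $[m_{k1}^l, m_{k2}^l]$, i.e.,

$$\mu_k^l(x_k) = \exp\left[-\frac{1}{2}\left(\frac{x_k - m_k^l}{\sigma_k^l}\right)^2\right], \quad m_k^l \in [m_{k1}^l, m_{k2}^l] \tag{2}$$

where: $k = 1, \ldots, p$; $p$ is the number of antecedents; $l = 1, \ldots, M$; and $M$ is the
number of rules. The upper MF, $\overline{\mu}_k^l(x_k)$, is (see Fig. 5)

$$\overline{\mu}_k^l(x_k) = \begin{cases} N(m_{k1}^l, \sigma_k^l; x_k), & x_k < m_{k1}^l \\ 1, & m_{k1}^l \le x_k \le m_{k2}^l \\ N(m_{k2}^l, \sigma_k^l; x_k), & x_k > m_{k2}^l \end{cases} \tag{3}$$

where, for example, $N(m_{k1}^l, \sigma_k^l; x_k) = \exp\left[-\frac{1}{2}\left(\frac{x_k - m_{k1}^l}{\sigma_k^l}\right)^2\right]$.

The lower MF, $\underline{\mu}_k^l(x_k)$, is (see Fig. 5)

$$\underline{\mu}_k^l(x_k) = \begin{cases} N(m_{k2}^l, \sigma_k^l; x_k), & x_k \le \frac{m_{k1}^l + m_{k2}^l}{2} \\ N(m_{k1}^l, \sigma_k^l; x_k), & x_k > \frac{m_{k1}^l + m_{k2}^l}{2} \end{cases} \tag{4}$$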

3.2  Introduction  to  Type-2  FLS 

Figure  6  shows  the  structure  of  a  type-2  FLS  [14].  It  is  very  similar  to  the  structure  of  a 
type-1  FLS  [11].  For  a  type-1  FLS,  the  output  processing  block  only  contains  the  defuzzifier. 
We  assume  that  the  reader  is  familiar  with  type-1  FLSs,  so  that  here  we  focus  only  on  the 
similarities  and  differences  between  the  two  FLSs. 

The  fuzzifier  maps  the  crisp  input  into  a  fuzzy  set.  This  fuzzy  set  can,  in  general,  be  a 
type-2  set. 

In the type-1 case, we generally have “IF-THEN” rules, where the $l$th rule has the form
“$R^l$: IF $x_1$ is $F_1^l$ and $x_2$ is $F_2^l$ and $\cdots$ and $x_p$ is $F_p^l$, THEN $y$ is
$G^l$”, where: the $x_i$ are inputs; the $F_i^l$ are antecedent sets ($i = 1, \ldots, p$); $y$ is
the output; and the $G^l$ are consequent sets. The distinction
between type-1 and type-2 is associated with the nature of the membership functions, which is
not important while forming rules; hence, the structure of the rules remains exactly the same
in the type-2 case, the only difference being that now some or all of the sets involved are of
type-2; so, the $l$th rule in a type-2 FLS has the form “$R^l$: IF $x_1$ is $\tilde{F}_1^l$ and
$x_2$ is $\tilde{F}_2^l$ and $\cdots$ and $x_p$ is $\tilde{F}_p^l$, THEN $y$ is $\tilde{G}^l$”.

In  the  type-2  case,  the  inference  process  is  very  similar  to  that  in  type-1.  The  inference 
engine  combines  rules  and  gives  a  mapping  from  input  type-2  fuzzy  sets  to  output  type-2 
fuzzy  sets.  To  do  this,  one  needs  to  find  unions  and  intersections  of  type-2  sets,  as  well  as 
compositions  of  type-2  relations. 

In a type-1 FLS, the defuzzifier produces a crisp output from the fuzzy set that is the
output of the inference engine, i.e., a type-0 (crisp) output is obtained from a type-1 set. In
the type-2 case, the output of the inference engine is a type-2 set; so, “extended versions” (using
Zadeh’s Extension Principle [18]) of type-1 defuzzification methods were proposed in [14]. This
extended defuzzification, called type-reduction, gives a type-1 fuzzy set called the
“type-reduced set”.

To  obtain  a  crisp  output  from  a  type-2  FLS,  we  can  defuzzify  the  type-reduced  set.  The 
most  natural  way  of  doing  this  seems  to  be  by  finding  the  centroid  of  the  type-reduced  set; 
however,  there  exist  other  possibilities  like  choosing  the  highest  membership  point  in  the  type- 
reduced  set. 

General type-2 FLSs are computationally intensive, because type-reduction is computationally
very intensive. Things simplify a lot when secondary membership functions (MFs) are interval sets
(in this case, the secondary memberships are either 0 or 1). When the secondary MFs are interval
sets, the type-2 FLSs are called “interval type-2 FLSs”. In [10], Liang and Mendel proposed the
theory and design of interval type-2 FLSs. They proposed an efficient and simplified method
to compute the input and antecedent operations for interval type-2 FLSs, one that is based on
a general inference formula for them.
In an interval type-2 nonsingleton FLS with type-2 fuzzification and meet under minimum
or product t-norm, the result of the input and antecedent operations, $F^l$, is an interval type-1
set, i.e., $F^l = [\underline{f}^l, \overline{f}^l]$, where $\underline{f}^l$ and $\overline{f}^l$
simplify to

$$\underline{f}^l = \underline{\mu}_{\tilde{F}_1^l}(x_1) \star \cdots \star \underline{\mu}_{\tilde{F}_p^l}(x_p) \tag{5}$$

and

$$\overline{f}^l = \overline{\mu}_{\tilde{F}_1^l}(x_1) \star \cdots \star \overline{\mu}_{\tilde{F}_p^l}(x_p) \tag{6}$$

where $x_i$ ($i = 1, \ldots, p$) denotes the location of the singleton. In this paper, we use
center-of-sets type-reduction [10], which can be expressed as:

$$Y_{cos}(Y^1, \ldots, Y^M, F^1, \ldots, F^M) = [y_l, y_r] = \int_{y^1} \cdots \int_{y^M} \int_{f^1} \cdots \int_{f^M} 1 \Big/ \frac{\sum_{i=1}^{M} f^i y^i}{\sum_{i=1}^{M} f^i} \tag{7}$$

where $Y_{cos}$ is an interval set determined by two end points, $y_l$ and $y_r$;
$f^i \in F^i = [\underline{f}^i, \overline{f}^i]$; $y^i \in Y^i = [y_l^i, y_r^i]$, and $Y^i$ is the
centroid of the type-2 interval consequent set $\tilde{G}^i$; and $i = 1, \ldots, M$. We also use
the training method proposed in [10] for designing an interval type-2 FLS in which its parameters
are tuned.
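The firing interval and the end points of the center-of-sets type-reduced set can be illustrated with a short Python sketch. It assumes the product t-norm and finds $y_l$ and $y_r$ by brute-force enumeration over the end points of each firing interval, which is exact for small rule bases because the weighted average is monotonic in each $f^i$; this is a simple stand-in and not necessarily the computational procedure of [10]. All function names and numeric values are hypothetical.

```python
from itertools import product


def firing_interval(lower_grades, upper_grades):
    """Firing interval [f_lower, f_upper] of one rule under the product t-norm."""
    f_lower, f_upper = 1.0, 1.0
    for lo, hi in zip(lower_grades, upper_grades):
        f_lower *= lo
        f_upper *= hi
    return f_lower, f_upper


def center_of_sets(firing, centroids):
    """End points [y_l, y_r] of the type-reduced set.

    firing    -- list of (f_lower, f_upper), one pair per rule
    centroids -- list of (y_left, y_right), consequent centroid per rule
    Enumerates all 2**M end-point combinations (fine for small M only).
    """
    y_l, y_r = float("inf"), float("-inf")
    for weights in product(*firing):
        total = sum(weights)
        if total == 0.0:
            continue  # no rule fires for this combination
        left = sum(f * y[0] for f, y in zip(weights, centroids)) / total
        right = sum(f * y[1] for f, y in zip(weights, centroids)) / total
        y_l = min(y_l, left)
        y_r = max(y_r, right)
    return y_l, y_r


# Two illustrative rules with made-up firing intervals and centroids.
firing = [(0.2, 0.6), (0.5, 0.9)]
centroids = [(1.0, 2.0), (3.0, 4.0)]
y_l, y_r = center_of_sets(firing, centroids)
crisp_output = (y_l + y_r) / 2.0  # defuzzify the interval by its midpoint
```

The midpoint in the last line corresponds to taking the centroid of the interval type-reduced set as the crisp output.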


4  Sensed Signal Strength Forecasting Using Interval Type-2
FLS


An acoustic amplitude sensor node measures sound amplitude at its microphone. Assuming that
the sound source is a point source and sound propagation is lossless and isotropic, a
root-mean-squared (RMS) amplitude measurement $z$ is related to the sound source position $X$
as

$$z = \frac{a}{\|X - \zeta\|} + w, \tag{8}$$

where $a$ is the RMS amplitude of the sound source, $\zeta$ is the location of the sensor, and $w$
is RMS measurement noise [9]. According to [9], $w$ is modelled as a Gaussian with zero mean
and variance $\sigma^2$. The sound source amplitude $a$ is also modelled as a random quantity,
which is uniformly distributed in the interval $[a_{lo}, a_{hi}]$. Given the location of the sound
source $X$ and the sensor position $\zeta$, $a / \|X - \zeta\|$ is uniformly distributed, as $a$
is. Therefore, $z$ should be modelled as a Gaussian primary MF having a fixed standard deviation
and an uncertain mean, as shown in Fig. 5.
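The measurement model can be made concrete with a small Python sketch. It draws the source amplitude uniformly and adds Gaussian noise, so each reading is Gaussian around a mean that is only known to lie in an interval. The function names, positions, and amplitude range are assumptions for illustration only.

```python
import math
import random


def mean_interval(a_lo, a_hi, source, sensor):
    """Interval of possible noise-free readings a / ||X - zeta||."""
    distance = math.dist(source, sensor)
    return a_lo / distance, a_hi / distance


def simulate_reading(a_lo, a_hi, source, sensor, noise_std, rng):
    """One RMS amplitude reading: uniform source amplitude plus Gaussian noise."""
    amplitude = rng.uniform(a_lo, a_hi)
    return amplitude / math.dist(source, sensor) + rng.gauss(0.0, noise_std)


# Illustrative geometry: source at the origin, sensor 5 units away.
rng = random.Random(1)
source, sensor = (0.0, 0.0), (3.0, 4.0)
m_lo, m_hi = mean_interval(2.0, 4.0, source, sensor)
readings = [simulate_reading(2.0, 4.0, source, sensor, 0.05, rng) for _ in range(1000)]
```

Here `m_lo` and `m_hi` play the role of the two extreme means of the Gaussian primary MF with uncertain mean, and `noise_std` plays the role of its fixed standard deviation.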

FLSs have been extensively used in time-series forecasting (e.g., [12], [10]). Since the sensed
signal strength in WSN is self-similar, as demonstrated in Section 2, its characteristics can be
captured, which also means it can be forecasted. Here we apply an interval type-2 FLS to
multi-step forecasting with step size $L$. We use four antecedents, i.e., $x(k - 4L)$,
$x(k - 3L)$, $x(k - 2L)$, and $x(k - L)$, as inputs of the FLS to predict $x(k)$. Similarly,
we use $x(k - 4L + i)$, $x(k - 3L + i)$, $x(k - 2L + i)$, and $x(k - L + i)$ to predict
$x(k + i)$, $\forall i < L$. If each antecedent has two fuzzy sets, the number of rules is
$2^4 = 16$. The rules are set up as in the example shown below:

$R^l$: IF $x(k - 4L)$ is $\tilde{F}_1^l$ and $x(k - 3L)$ is $\tilde{F}_2^l$ and $x(k - 2L)$ is
$\tilde{F}_3^l$ and $x(k - L)$ is $\tilde{F}_4^l$, THEN $x(k)$ is $\tilde{G}^l$.

We use center-of-sets type-reduction and the steepest descent training algorithm [10] to design
this interval type-2 FLS.

For comparison, we also design a type-1 FLS for signal strength forecasting. The antecedents
are the same as in the interval type-2 FLS; however, Gaussian MFs are chosen for this type-1
FLS. There are also 16 rules, since each of the antecedents again has 2 fuzzy sets. The
rules are designed as:

$R^l$: IF $x(k - 4L)$ is $F_1^l$ and $x(k - 3L)$ is $F_2^l$ and $x(k - 2L)$ is $F_3^l$ and
$x(k - L)$ is $F_4^l$, THEN $x(k)$ is $G^l$.

We use a center-of-sets defuzzifier and the steepest descent training algorithm to design this
type-1 FLS.
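The input/output structure shared by both forecasters can be sketched as follows. The Python below only builds the four lagged antecedents and enumerates the 16 antecedent combinations; it does not implement the FLS or its training. The function name, the labels "low" and "high", and the 0-based indexing are assumptions for illustration.

```python
from itertools import product


def make_training_pairs(x, step):
    """Build ([x(k-4L), x(k-3L), x(k-2L), x(k-L)], x(k)) pairs with L = step.

    Uses 0-based indexing, so the first usable target is x[4 * step].
    """
    pairs = []
    for k in range(4 * step, len(x)):
        inputs = [x[k - j * step] for j in (4, 3, 2, 1)]
        pairs.append((inputs, x[k]))
    return pairs


# Two fuzzy sets per antecedent and four antecedents give 2**4 = 16 rules.
rule_antecedents = list(product(("low", "high"), repeat=4))

# Tiny illustrative series: with L = 3 the first target is x[12].
series = list(range(20))
pairs = make_training_pairs(series, step=3)
```

Each tuple in `rule_antecedents` names which of the two fuzzy sets is used for each lagged input in one rule.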
Our event forecasting scheme consists of two steps: sensed signal strength forecasting and
event detection. In this paper, we propose a new event detection algorithm, double sliding
window event detection.

5  Double  Sliding  Window  Event-Detection 

In [15], the acoustic energy over a fixed period of time is integrated; when it exceeds a
threshold, an event is declared:

$$E_s = \sum_{m=0}^{M-1} |z_{n-m}|^2, \tag{9}$$

$$E_s \ge E_{threshold}. \tag{10}$$

However, this simple method suffers from a significant drawback; namely, the value of the
threshold depends on the sensed signal energy. When there is no event occurring in the sensing
range, the sensed signal consists of only noise. The level of the noise power is generally
unknown and can change when the environment changes or if unwanted interferers go on and
off. Therefore, it is quite difficult to set a fixed threshold. We propose a double sliding window
algorithm for event detection so as to alleviate the threshold value selection problem.

The double sliding window event-detection algorithm computes the sensed signal energy in two consecutive sliding windows. The basic principle is to form the decision variable as the ratio of the total energy contained in the two windows. Figure 7 shows the windows A and B and the response of the ratio $m_n$ to the start and end of a sensed event. When only noise is sensed the response is flat, since both windows ideally contain the same amount of noise energy.

The energies in window A and window B are

$$E_a = \sum_{m=0}^{M-1} |z_{n-m}|^2, \tag{11}$$

$$E_b = \sum_{l=0}^{M-1} |z_{n+l}|^2. \tag{12}$$

Then the decision variable $m_n$ is

$$m_n = \frac{E_a}{E_b}. \tag{13}$$


When $m_n$ exceeds the threshold $Th_1$, an event is declared to have started (see Fig. 7(a)). This approach has two advantages. First, the decision variable $m_n$ does not depend on the sensed signal energy itself, but on the ratio of the energies of two consecutive windows. Second, we can detect not only the starting edge of the event but also its ending edge: when $m_n$ falls below the threshold $Th_2$, the event is declared to have ended (see Fig. 7(b)).
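The decision variable of Eqs. (11)-(13) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the synthetic trace are hypothetical, and the window indices are taken exactly as printed, so window A covers samples n-M+1..n and window B covers samples n..n+M-1. With this indexing the ratio stays at 1 on a steady signal and departs from 1 in opposite directions at the two edges of an event; which edge produces the peak depends on which window is treated as the leading one, so E_a and E_b should be swapped if the other orientation is intended.

```python
def window_energy(z, start, size):
    """Sum of squared magnitudes of z[start : start + size]."""
    return sum(abs(v) ** 2 for v in z[start:start + size])


def double_sliding_window(z, M):
    """Return {n: m_n} with m_n = E_a / E_b, using the indices of Eqs. (11)-(13).

    E_a sums |z[n-m]|^2 for m = 0..M-1 (samples n-M+1 .. n);
    E_b sums |z[n+l]|^2 for l = 0..M-1 (samples n .. n+M-1).
    Indices where E_b is zero are skipped to avoid dividing by zero.
    """
    ratios = {}
    for n in range(M - 1, len(z) - M + 1):
        e_a = window_energy(z, n - M + 1, M)
        e_b = window_energy(z, n, M)
        if e_b > 0:
            ratios[n] = e_a / e_b
    return ratios


# Synthetic trace (illustrative only): steady level, a louder segment, steady level again.
signal = [1.0] * 20 + [4.0] * 20 + [1.0] * 20
ratios = double_sliding_window(signal, M=5)
flat = ratios[10]                          # both windows see the same steady level
dip_n = min(ratios, key=ratios.get)        # index where the ratio is smallest
peak_n = max(ratios, key=ratios.get)       # index where the ratio is largest
```

On this trace the ratio is flat away from the event and its extreme values sit next to the two edges (samples 20 and 40), which is the behaviour the thresholds $Th_1$ and $Th_2$ are meant to pick up.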


6  Simulations 

Our simulations were based on N = 480 samples, x(1), x(2), ..., x(480). The first 240 samples, x(1), x(2), ..., x(240), are used for training, and the remaining 240 samples, x(241), x(242), ..., x(480), are used for testing. Fig. 8 plots the sensed data used for training and testing, x(1), x(2), ..., x(480). A standard 1 kHz audio signal with different volume levels was used to simulate the events. Each sample has a duration of 1024 ms.

We applied a type-1 FLS and an interval type-2 FLS to sensed signal forecasting. The initial locations of the antecedent MFs were based on the mean, $m_t$, and standard deviation, $\sigma_t$, of the training data set. The parameters and the number of parameters in the type-1 FLS and the interval type-2 FLS are summarized in Table 1. The initial values chosen for the Gaussian MFs are listed in Table 2. We then use the steepest descent algorithm to train all the parameters on the training data. After training, all the parameters and rules are fixed, and we test the interval type-2 FLS on the remaining 240 samples, x(241), x(242), ..., x(480). We set the step size to L = 5 in both the type-1 FLS and the interval type-2 FLS. The window size M in the double sliding window event detection is also 5, which makes the sensed signal forecasting meaningful.

We compared the performance of the interval type-2 FLS with that of the type-1 FLS for sensed signal strength forecasting. For each FLS, we ran 100 Monte-Carlo realizations to average out the randomness of the consequents, and both FLSs were tuned using a simple steepest-descent algorithm for 5 epochs. We used the testing data to evaluate each FLS by computing the root-mean-square error (RMSE) between the defuzzified output of the FLS and the actual sensor data x(k), i.e.,


$$\mathrm{RMSE} = \sqrt{\frac{1}{240} \sum_{k=241}^{480} \left[ x(k) - f(\mathbf{x}^k) \right]^2}, \tag{14}$$


where $\mathbf{x}^k = [x(k - 4 \times 5), x(k - 3 \times 5), x(k - 2 \times 5), x(k - 1 \times 5)]^T$, and $T$ denotes transpose. The RMSEs of all simulations are summarized in Figure 9. As Figure 9 shows, the interval type-2 FLS outperforms the type-1 FLS in sensed signal strength forecasting.
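The lagged input vector and the RMSE of Eq. (14) can be illustrated with a short Python sketch. The helper names are hypothetical, the `persistence` forecaster is only a stand-in for a trained FLS, and the ramp data are synthetic; sample indices are 1-based as in the text.

```python
import math


def lagged_inputs(x, k, L=5, taps=4):
    """Return [x(k-4L), x(k-3L), x(k-2L), x(k-L)] using 1-based sample indices."""
    return [x[k - j * L - 1] for j in range(taps, 0, -1)]


def rmse(x, forecaster, k_first, k_last, L=5):
    """Root-mean-square error of forecaster over k = k_first..k_last (1-based, inclusive)."""
    squared = [(x[k - 1] - forecaster(lagged_inputs(x, k, L))) ** 2
               for k in range(k_first, k_last + 1)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical stand-in for a trained FLS: repeat the most recent lagged sample x(k-L).
def persistence(inputs):
    return inputs[-1]


# Illustrative data only: a ramp, so x(k) - x(k-5) is always 5.
x = [float(i) for i in range(1, 481)]
first_inputs = lagged_inputs(x, 241)
error = rmse(x, persistence, 241, 480)
```

Replacing `persistence` with the defuzzified output of either FLS gives the quantity that is compared in Figure 9.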

We are more interested in the system's capability to forecast events, especially their starting points. We used the forecasted data sets to detect the starting point of each event, i.e., the time stamp of the event occurrence, and compared it with the actual time stamp. We evaluated our double sliding window algorithm against the cumulated signal strength scheme [15]. We chose $Th_1 = \mathrm{mean} + \mathrm{std}$ for the double sliding window event detection. Since the threshold is hard to choose for the cumulated signal strength scheme, we ran simulations for three different thresholds: mean, mean + std/2, and mean + std. We again ran 100 Monte-Carlo simulations to obtain the average absolute error between the forecasted and actual time stamps, $|D_i - P_i|$, where $D_i$ is the detected starting point (based on the forecasted signal) and $P_i$ is the actual starting point. The results are summarized in Table 3.

As Table 3 shows, event detection based on the signal forecasted by the interval type-2 FLS performs much better than detection based on the signal forecasted by the type-1 FLS. Moreover, our double sliding window is more effective than the existing cumulated signal strength scheme. Event forecasting helps with power on/off management of the WSN: we can power on the communication part of the sensor nodes only when an event has been forecasted. Since the sensing and signal processing parts consume less than 1/10 of the energy consumed by the communication part [1], this power on/off strategy can save a large amount of energy.

7  Conclusions 

In this paper, we proposed an event forecasting scheme for wireless sensor networks using an interval type-2 fuzzy logic system. Our event forecasting scheme consists of two steps: sensed signal strength forecasting and event detection. We demonstrated that real-world sensed acoustic signals are self-similar, which means they are forecastable. We showed that a type-2 fuzzy membership function (MF), namely a Gaussian MF with uncertain mean, is appropriate for modeling the sensed signal strength of wireless sensors. We then applied an interval type-2 FLS to sensed signal forecasting. Furthermore, we proposed a double sliding window for event detection based on the forecasted signal, and compared it against the existing cumulated signal strength scheme. Simulation results show that FLSs can be used for sensed signal strength forecasting, and that the interval type-2 FLS performs much better than the type-1 FLS. The forecasted signal can further be used for event detection, and the average absolute error between the actual starting point and the point detected from the signal forecasted by the interval type-2 FLS is much smaller than the error obtained with the type-1 FLS.

Table 1: The parameters and number of parameters in type-1 and interval type-2 FLSs.

                                 type-1                            interval type-2
Parameters in one antecedent     $m_{F_k^l}$, $\sigma_{F_k^l}$     $m_{k1}^l$, $m_{k2}^l$, $\sigma_k^l$
Parameters in one consequent     $\bar{y}^l$                       $y_l^l$, $y_r^l$
Total number of parameters       144                               224
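The totals in Table 1 can be checked by direct counting. The check below assumes that each of the 16 rules carries its own parameters for its 4 antecedents and its consequent, with 2 parameters per type-1 antecedent, 3 per interval type-2 antecedent, 1 per type-1 consequent and 2 per interval type-2 consequent, as listed in the table:

$$\underbrace{16 \times 4 \times 2}_{\text{antecedents}} + \underbrace{16 \times 1}_{\text{consequents}} = 144, \qquad \underbrace{16 \times 4 \times 3}_{\text{antecedents}} + \underbrace{16 \times 2}_{\text{consequents}} = 224.$$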

Table 2: Initial values of the parameters in type-1 and interval type-2 FLSs. Each antecedent is described by two fuzzy sets.

              Type-1 FLS                                 Interval Type-2 FLS
mean          $m_t - 2\sigma_t$ or $m_t + 2\sigma_t$     $[m_t - 2.5\sigma_t, m_t - 1.5\sigma_t]$ or $[m_t + 1.5\sigma_t, m_t + 2.5\sigma_t]$
$\sigma$      $2\sigma_t$                                $2\sigma_t$
consequent    $\bar{y}^l \in [\min, \max]$               $y_l^l = \bar{y}^l - \sigma_t$, $y_r^l = \bar{y}^l + \sigma_t$

Table 3: Average absolute error between the forecasted and actual time stamps of the starting edge of events for the type-1 FLS and the interval type-2 FLS. Here, 1 stands for one sample (1024 ms), and m and σ stand for the mean and the standard deviation of the cumulated signal strength of the training data, respectively.
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interval  type-2  FLS 

double  sliding  window 

7.8 

1.4 

SS  with  th  =  m 

57.8 
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SS  with  th  =  m  +  cr/2 

24.4 
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SS  with  th  =  m  +  a 
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16.6 
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Figure  1:  The  deployment  of  eight  sensor  nodes  in  our  experiments. 


Figure  2:  The  variance-time  plot  for  sensed  signal  strength  with  fixed  source  as  background 
during  3  hours.  The  sample  period  is  1024ms. 
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Figure  3:  The  variance-time  plot  for  sensed  signal  strength  without  fixed  source  during  3 
hours.  The  sample  period  is  1024ms. 


Figure  4:  (a)  Pictorial  representation  of  a  Gaussian  type-2  set.  The  secondary  memberships 
in  this  type-1  fuzzy  set  are  shown  in  (b),  and  are  Gaussian.  Note  that  this  set  is  called 
a  Gaussian  type-2  set  because  all  its  secondary  membership  functions  are  Gaussian.  The 
“principal”  membership  function  (the  bold  line) ,  which  is  triangular  in  this  case,  can  be  of  any 
shape. 
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Figure  5:  The  interval  type-2  MFs  with  fixed  std  and  uncertain  mean. 




Figure  6:  The  structure  of  a  type-2  FLS.  In  order  to  emphasize  the  importance  of  the  type- 
reduced  set,  we  have  shown  two  outputs  for  the  type-2  FLS,  the  type-reduced  set  and  the  crisp 
defuzzified  value. 




Figure  7:  The  response  of  the  double  sliding  window  event-detection  algorithm,  (a)  starting 
edge  of  the  event,  and  (b)  ending  edge  of  the  event. 
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Figure  8:  Sensed  data  for  1024  seconds.  (The  sample  period  is  1024ms). 
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Figure  9:  The  RMSE  of  sensed  signal  strength  forecasting  of  two  FLSs  averaged  over  100 
Monte-Carlo realizations.
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Abstract — In  densely  deployed  wireless  sensor  networks, 
not only does the data of one sensor node have self-similarity, but the data from adjacent sensor nodes also have cross-similarity. Therefore, there is high redundancy in the data collected from sensor nodes in the same neighborhood. Because of the intrinsic constraints of wireless sensor networks, e.g., limited energy and bandwidth, this kind of information redundancy affects the whole network negatively. In this paper, we propose to use Singular-Value-QR Decomposition (SVD-QR) to reduce the redundancy in wireless sensor networks.

I.  Introduction 

Wireless sensor networking is an emerging technology that promises unprecedented ability to monitor and manipulate the physical world via a network of densely distributed wireless sensor nodes. The nodes can sense the physical environment in a variety of modalities, including acoustic, seismic, thermal, and infrared. They are networked together in an ad hoc fashion, which involves peer-to-peer communication in a network with a dynamically changing topology. Wireless sensor networks do not rely on a preexisting fixed infrastructure, such as a wireline backbone network or a base station. They are self-organizing entities that are deployed on demand in support of various applications such as security and surveillance, monitoring of wildlife habitats, smart sensor-instrumented environments, and condition-based maintenance of complex systems.

Sensor nodes are typically powered by small batteries that are hard to replace or recharge. Hence, how to use the sensor nodes efficiently, i.e., to extend the lifetime of the nodes as long as possible without losing essential information, is an important issue.

Sensor nodes in wireless sensor networks are usually densely deployed, e.g., tens of sensor nodes per square meter [6]. The data collected by adjacent sensor nodes can therefore be very similar, which means there is redundancy among them. Taking advantage of this property, we propose to reduce the redundancy using Singular-Value-QR Decomposition (SVD-QR) so as to prolong the lifetime of the whole network.


In this paper, we use the Xbow wireless sensor network professional developer's kit MOTE-Kit [5] as our testbed to collect data sets in different scenarios. The rest of the paper is organized as follows. Section II studies the self-similarity of the sensed data; redundancy reduction for wireless sensor networks using SVD-QR is presented in Section III; and conclusions and future work are given in Section IV.

II.  Self-Similarity  of  Sensor  Network  Data 

For a detailed discussion of self-similarity in time series, see [8], [7]. Here we briefly present its definition [4]. Given a zero-mean, stationary time series $X = (X_t; t = 1, 2, 3, \ldots)$, we define the m-aggregated series $X^{(m)} = (X_k^{(m)}; k = 1, 2, 3, \ldots)$ by summing the original series X over non-overlapping blocks of size m. X is said to be H-self-similar if, for all positive m, $X^{(m)}$ has the same distribution as X rescaled by $m^H$. That is,

$$X_t \overset{d}{=} m^{-H} \sum_{i=(t-1)m+1}^{tm} X_i \quad \forall m \in \mathbb{N}. \tag{1}$$

If X is H-self-similar, it has the same autocorrelation function $r(k) = E[(X_t - \mu)(X_{t+k} - \mu)]/\sigma^2$ as the series $X^{(m)}$ for all m, which means that the series is distributionally self-similar: the distribution of the aggregated series is the same as that of the original.

Self-similar processes can show long-range dependence. A process with long-range dependence has an autocorrelation function $r(k) \sim k^{-\beta}$ as $k \to \infty$, where $0 < \beta < 1$. The degree of self-similarity can be expressed using the Hurst parameter $H = 1 - \beta/2$. For self-similar series with long-range dependence, $1/2 < H < 1$. As $H \to 1$, the degree of both self-similarity and long-range dependence increases.

One method that has been widely used to verify self-similarity is the variance-time plot, which relies on the slowly decaying variance of a self-similar series. The variance of $X^{(m)}$ is plotted against m on a log-log plot; a straight line with slope $-\beta$ greater than $-1$ is indicative of self-similarity, and the parameter H is given by $H = 1 - \beta/2$. We use this method to verify the self-similarity of the acoustic signal.
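A variance-time estimate of this kind can be sketched as follows. The function names are hypothetical, and the sketch assumes that the aggregated series is normalized as a block mean, which is the normalization under which the log-log slope equals $-\beta$ (with plain block sums the variance would instead grow with slope $2 - \beta$). White Gaussian noise is used as a reference input because it has no long-range dependence, so its slope should be close to $-1$ and the resulting H close to 1/2.

```python
import math
import random


def aggregate_mean(x, m):
    """Average x over non-overlapping blocks of size m (an incomplete tail is dropped)."""
    return [sum(x[i:i + m]) / m for i in range(0, len(x) - m + 1, m)]


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def variance_time_slope(x, block_sizes):
    """Least-squares slope of log10 Var(X^(m)) against log10 m."""
    points = [(math.log10(m), math.log10(variance(aggregate_mean(x, m))))
              for m in block_sizes]
    mean_x = sum(p[0] for p in points) / len(points)
    mean_y = sum(p[1] for p in points) / len(points)
    num = sum((px - mean_x) * (py - mean_y) for px, py in points)
    den = sum((px - mean_x) ** 2 for px, _ in points)
    return num / den


random.seed(0)
white = [random.gauss(0.0, 1.0) for _ in range(20000)]   # reference series
slope = variance_time_slope(white, [1, 2, 4, 8, 16, 32, 64])
hurst = 1 + slope / 2          # H = 1 - beta/2 with slope = -beta
```

A measured series whose slope is clearly shallower than that of the white-noise reference (that is, greater than $-1$) would be read as self-similar in the sense used here.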

In our experiments, 8 sensors were deployed in a lab. The locations of the sensors are shown in Fig. 1. We designed two scenarios, one with a fixed source and one without. Fig. 2 plots the variance of $X^{(m)}$ against m on a log-log scale for each of the 8 sensor nodes in the first scenario, and Fig. 3 does the same for the second scenario. To verify that the data from all the sensor nodes taken together are also self-similar, we mixed the data sets into a new time series $Y = (X_t^1, X_t^2, \ldots, X_t^8; t = 1, 2, 3, \ldots)$ and plotted its variance-time curve in Fig. 4. These three figures show that, in both scenarios, the data of each single sensor node and the mixed sensor network data are self-similar, because their traces have slopes much greater than $-1$.
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Fig.  1.  The  deployment  of  the  eight  sensor  nodes  in  our  experiments. 


Fig.  2.  The  variance-time  plot  for  sensed  signal  strength  with  fixed 
source  as  background  during  3  hours.  The  sampling  period  is  1024ms. 


Fig.  3.  The  variance-time  plot  for  sensed  signal  strength  without  fixed 
source  during  3  hours.  The  sampling  period  is  1024ms. 


III.  Redundancy  Reduction  in  Wireless  Sensor 
Networks  Using  SVD-QR 

In the previous section, we showed that the data sets collected by adjacent sensor nodes are quite similar to each other, so there is redundancy in the collected information. Two questions arise. Is this kind of redundancy beneficial? Do more copies of the data set mean better estimates? The answer to both is no. The goal of wireless sensor networks is to monitor the physical world and provide enough information of interest so that users can perform further tasks, e.g., event detection and target estimation and tracking. Blair and Bar-Shalom have already demonstrated in [9] that more data from more sensor nodes does not mean better performance in terms of the maximum root-mean-square error (RMSE). Meanwhile, if we can get enough information from fewer sensor nodes, we can turn off the other sensor nodes to save energy and prolong the lifetime of the whole network.

How should the principal nodes be selected to represent the whole neighborhood effectively? We view the data from all the adjacent sensor nodes as a matrix P: each column of P is the data from one sensor node, and each row of P is the data collected at one epoch from all the sensor nodes. The problem of picking the principal nodes then reduces to subset selection.

Several subset selection methods exist [1], but a singular value decomposition (SVD) method is preferable in rank-deficient problems [2]. Furthermore, the SVD provides a natural way to separate a space into dominant and subdominant subspaces. If we view the data matrix P as a span of the input subspace, then the SVD decomposes the span into an equivalent orthogonal span, from which we can identify the dominant and subdominant spans. In this way, we solve two problems simultaneously: (i) we estimate how many sensor nodes' data sets are needed to represent the neighborhood effectively, and (ii) we identify which sensor nodes' data sets are the principal ones. The remainder can be discarded, and the corresponding sensor nodes can be turned off to conserve energy.

Fig. 4. The variance-time plot for mixed sensor data during 3 hours. The sampling period is 1024ms.

A.  Introduction  of  SVD-QR  Algorithm 

Here, we use the following SVD-QR algorithm, similar to the one in [2] and [3], to select a set of independent data sets that minimizes the residual error in a least-squares sense:

1) Given $P \in R^{N \times M}$, assume $N > M$, and let $\mathrm{rank}(P) = r \le M$ denote the rank of P. Determine a numerical estimate $\hat{r}$ of the rank of the data sets matrix P by calculating the singular value decomposition

$$P = U \begin{bmatrix} \Sigma & 0 \\ 0 & 0 \end{bmatrix} V^T, \tag{2}$$

where U is an N × N matrix of orthonormalized eigenvectors of $PP^T$, V is an M × M matrix of orthonormalized eigenvectors of $P^TP$, and Σ is the diagonal matrix $\Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_r)$, where $\sigma_i$ denotes the ith singular value of P, and $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0$. Select $\hat{r} \le r$.

2) Calculate a permutation matrix Π such that the columns of the matrix $\Gamma_1 \in R^{N \times \hat{r}}$ in

$$P\Pi = [\Gamma_1, \Gamma_2] \tag{3}$$

are independent. The permutation matrix Π is obtained from the QR decomposition of the submatrix comprised of the right singular vectors that correspond to the $\hat{r}$ ordered most-significant singular values.

In short, we select the data sets as follows:

• Compute the SVD of P and save V.

• Observe Σ and select an appropriate $\hat{r}$.

• Partition

$$V = \begin{bmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{bmatrix}, \tag{4}$$

where $V_{11} \in R^{\hat{r} \times \hat{r}}$, $V_{12} \in R^{\hat{r} \times (M-\hat{r})}$, $V_{21} \in R^{(M-\hat{r}) \times \hat{r}}$, and $V_{22} \in R^{(M-\hat{r}) \times (M-\hat{r})}$. In many practical cases, $\sigma_1$ is much larger than $\sigma_2$; thus $\hat{r}$ can be chosen much smaller than the estimate r of rank(P), even 1.

• Using QR decomposition with column pivoting, determine Π such that

$$Q^T [V_{11}^T, V_{21}^T] \Pi = [R_{11}, R_{12}], \tag{5}$$

where Q is a unitary matrix, $R_{11}$ and $R_{12}$ form an upper triangular matrix, and Π is the permutation matrix; the column permutation Π is chosen so that abs(diag(R)) is decreasing. In short, Π corresponds to the $\hat{r}$ ordered most-significant sets.


B.  An  example  of  the  SVD-QR  decomposition  in  Redun¬ 
dancy  Reduction 

Here, we give an example of how to use the SVD-QR decomposition to reduce the redundancy in wireless sensor networks, i.e., to determine how many sensors' data should be selected.

We take one clip from the data sets used in Section II, i.e., 8 sensor nodes with 100 samples of data each, as the input of the following example.


Example  1 

• Step 1. Compute the SVD of the input matrix P to get
$\mathrm{diag}(\Sigma) = (14160, 74, 20, 14, 13, 10, 9, 7)$.
Clearly, Σ(1,1) is much larger than Σ(2,2). That means we can select only one data set to represent all eight sets of data, i.e., $\hat{r} = 1$.

• Step 2. Partition V to get $V_{11}$ and $V_{21}$, which are needed in the QR decomposition:

$$V_{11} = -0.3565, \qquad V_{21} = [-0.3556, -0.3535, -0.3512, -0.3546, -0.3526, -0.3540, -0.3503]^T.$$

• Step 3. Use QR decomposition with column pivoting to determine the permutation matrix Π. Since $\hat{r} = 1$ in this example, we only care about the first column of Π,

$$\Pi(:,1) = [1, 0, 0, 0, 0, 0, 0, 0]^T.$$

That means the first column of the input matrix P, i.e., the data collected from the first sensor node, is the most significant one and can effectively represent all eight sensor nodes in the neighborhood.


Example  2 

What if we select more than one set of data? The following example illustrates this case. We take another clip of data, again from 8 sensor nodes with 100 samples each, as the input.

• Step 1. Compute the SVD of the input matrix P to get
$\mathrm{diag}(\Sigma) = (14759, 368, 275, 200, 186, 146, 97, 68)$.
Observing Σ, the drop from Σ(1,1) to Σ(2,2) is not as large as in Example 1, so we choose $\hat{r} = 2$.

• Step 2. Partition V to get $V_{11}$ and $V_{21}$:

$$V_{11} = \begin{bmatrix} -0.3503 & 0.1919 \\ -0.3582 & -0.1618 \end{bmatrix}, \qquad
V_{21} = \begin{bmatrix} -0.3528 & 0.3369 \\ -0.3570 & -0.8685 \\ -0.3580 & -0.1417 \\ -0.3525 & 0.0853 \\ -0.3585 & 0.1548 \\ -0.3406 & 0.1338 \end{bmatrix}.$$

• Step 3. Use QR decomposition with column pivoting to determine the permutation matrix Π. Since $\hat{r} = 2$, we only care about the first two columns of Π,

$$\Pi(:,1:2) = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 1 \\ 1 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}.$$

That means the fourth and third columns of the input matrix P, i.e., the data collected from the fourth and third sensor nodes, are the most significant ones and can effectively represent all eight sensor nodes in the neighborhood.
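The pivoting step of Example 2 can be reproduced with a small pure-Python sketch. It is not the authors' implementation: instead of calling a library QR routine it uses greedy Gram-Schmidt with column pivoting, which applies the same pivot rule (pick the column with the largest remaining norm, then remove its direction from the other columns). The function name is hypothetical, and the SVD itself is not recomputed; the two right singular vectors are copied from the $V_{11}$ and $V_{21}$ values of Example 2.

```python
import math


def pivoted_column_selection(A, r_hat):
    """Greedy column pivoting on matrix A, given as a list of rows.

    At each step the column with the largest residual norm is chosen and its
    direction is projected out of every column not yet chosen.
    Returns the 0-based indices of the chosen columns, in pivot order.
    """
    cols = [list(c) for c in zip(*A)]          # work column by column
    chosen = []
    for _ in range(r_hat):
        norms = [math.sqrt(sum(v * v for v in c)) for c in cols]
        for j in chosen:
            norms[j] = -1.0                    # never pick the same column twice
        p = max(range(len(cols)), key=norms.__getitem__)
        chosen.append(p)
        q = [v / norms[p] for v in cols[p]]    # unit vector along the pivot column
        for j, c in enumerate(cols):
            if j not in chosen:
                dot = sum(a * b for a, b in zip(q, c))
                cols[j] = [a - dot * b for a, b in zip(c, q)]
    return chosen


# First two right singular vectors from Example 2: one row per singular vector,
# one column per sensor node.
V_T = [
    [-0.3503, -0.3582, -0.3528, -0.3570, -0.3580, -0.3525, -0.3585, -0.3406],
    [0.1919, -0.1618, 0.3369, -0.8685, -0.1417, 0.0853, 0.1548, 0.1338],
]
selected = [i + 1 for i in pivoted_column_selection(V_T, r_hat=2)]   # 1-based sensor numbers
```

On these values the sketch picks sensor 4 first and sensor 3 second, in agreement with the columns of Π reported in the example. In practice a library routine for QR with column pivoting would normally replace the hand-written loop.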

Across our many simulations, the most significant node(s) change with the environment. However, we can define a coherence time during which the environment is assumed to remain stable to a certain extent.


IV.  Conclusions  and  Future  Works 

In this paper, we used the MOTE-Kit [5] testbed to collect real data sets in different scenarios. First, we showed that there is not only self-similarity in the data from one sensor node, but also cross-similarity among the data of all the adjacent sensor nodes. This demonstrates that there is redundancy in the data collected by wireless sensor networks. Taking energy efficiency and performance into consideration, we proposed to use SVD-QR to select the principal data sets from particular sensor nodes to effectively represent all the sensor nodes in the neighborhood. We gave two examples to show how to do so.

Future work includes a theoretical analysis of how much information is lost after the redundancy is reduced, and whether this loss affects the performance of the wireless sensor network.
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Abstract  -  Wireless  Sensor  Networks  (WSN)  are  designed  to 
monitor physical phenomena. The main task of a WSN is to perform event detection, tracking, and classification. Compared with traditional ad-hoc networks, a WSN is therefore event-centric, and an important problem in WSN is how to detect events. In this paper, we present two methods for event detection: Double Sliding Window Detection and a Fuzzy Logic approach. The accuracy of the results is established via a sensor network testbed and simulations.

Keywords  -  Wireless  sensor  networks,  fuzzy  logic  systems, 
event  detection. 


I.  INTRODUCTION 


The infusion and maturation of Micro-Electro-Mechanical Systems (MEMS), computation, and wireless communication technologies have advanced the development of Wireless Sensor Networks (WSN). In a WSN, a large number of low-cost sensor nodes are densely deployed to monitor the environment of interest. Owing to its various applications [2], [3], WSN has generated a flurry of research activity.

Sensor nodes are typically powered by small batteries that are hard to replace or recharge. Hence, the energy constraint is a distinguishing characteristic of WSN compared with traditional wireless ad-hoc networks. Energy consumption occurs in three domains: sensing, data processing (including AD/DA and digital signal processing), and communications [5]. According to [1], the sensing and signal processing parts operate at low frequency and consume less than 1 mW. This is over an order of magnitude less than the energy consumption of the communication part. Therefore, we prefer less communication and data exchange between sensor nodes and more local processing within a single sensor node, so as to increase the lifetime of the WSN.

The main goal of a WSN is to monitor the physical world. Usually, people are more interested in unexpected events. For example, in a battlefield scenario, people are more interested in the appearance of enemies. If a WSN monitors for forest fires, an unusual increase in temperature should trigger a warning. Both the appearance of enemies and the unusual increase in temperature can be seen as events. Because of the energy constraint mentioned above, a WSN should ideally be event-driven, so that the communication part can be powered off most of the time. Only when certain sensor nodes detect an event do they activate the RF channel and transmit the useful information to the base station or headquarters. Event detection is therefore one of the key issues for WSN.

In this paper we present two approaches to event detection for WSN: double sliding window, and hybrid event detection using a fuzzy logic system. We use Berkeley MICA2 motes [4] as our testbed and evaluate the event-detection approaches on the acoustic data collected by the testbed in different experiments.

The remainder of the paper is organized as follows. The sensor model is given in Section II. The double sliding window approach and the hybrid event-detection approach based on a fuzzy logic system are presented in Section III and Section IV, respectively. Simulation results and discussions are presented in Section V. Section VI concludes the paper.


II.  ACOUSTIC  SENSOR  MODEL 


An acoustic amplitude sensor node measures the sound amplitude at its microphone. Assuming that the sound source is a point source and that sound propagation is lossless and isotropic, a root-mean-squared (RMS) amplitude measurement z is related to the sound source position x as

$$z = \frac{a}{\|x - \zeta\|} + w, \tag{1}$$

where a is the RMS amplitude of the sound source, ζ is the location of the sensor, and w is the RMS measurement noise [6]. In this paper, we use the Xbow wireless sensor network professional developer's kit MOTE-Kit for data collection.
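A measurement of this kind can be simulated with a few lines of Python. The sketch assumes the inverse-distance form that follows from a lossless, isotropic point source (amplitude falling off as one over the source-to-sensor distance) and, as a further assumption not stated in the text, zero-mean Gaussian measurement noise; the function and parameter names are hypothetical.

```python
import math
import random


def rms_amplitude(source_pos, sensor_pos, a, noise_std=0.0, rng=random):
    """Simulated RMS amplitude reading z = a / distance + w.

    Assumptions of this sketch: inverse-distance attenuation and zero-mean
    Gaussian noise w with standard deviation noise_std (no noise if 0).
    """
    distance = math.dist(source_pos, sensor_pos)
    w = rng.gauss(0.0, noise_std) if noise_std > 0 else 0.0
    return a / distance + w


# Noise-free readings of the same source at two distances (arbitrary length units).
near = rms_amplitude((0.0, 0.0), (3.0, 4.0), a=10.0)    # distance 5
far = rms_amplitude((0.0, 0.0), (6.0, 8.0), a=10.0)     # distance 10
```

Doubling the distance halves the noise-free reading, which is the property that makes the amplitude informative about the source position.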


III.  DOUBLE  SLIDING  WINDOW  EVENT-DETECTION 


In [9], the acoustic energy over a fixed period of time is integrated, and the authors declare that an event has been detected when this energy exceeds a threshold:

$$E_s = \sum_{m=0}^{M-1} |z_{n-m}|^2, \tag{2}$$

$$E_s \ge E_{\mathrm{threshold}}. \tag{3}$$

However, this simple method suffers from a significant drawback: the appropriate threshold depends on the sensed signal energy. When no event occurs in the sensing range, the sensed signal consists only of noise. The noise power is generally unknown and can change when the environment changes or when unwanted interferers switch on and off. It is therefore quite difficult to set a fixed threshold. We design a double sliding window algorithm for event detection to alleviate this threshold selection problem.

The double sliding window event-detection algorithm computes the sensed signal energy in two consecutive sliding windows. The basic principle is to form the decision variable as the ratio of the total energy contained in the two windows. Figure 1 shows the windows A and B and the response of the ratio $m_n$ to a sensed event. When only noise is sensed the response is flat, since both windows ideally contain the same amount of noise energy.


Fig.  1.  The  response  of  the  double  sliding  window  event-detection  algorithm. 


The  advantage  of  this  approach  is  the  decision  variable  mn 
does  not  depend  on  the  sensed  signal  energy,  but  on  the  ratio 
of  the  energy  of  two  consecutive  windows. 


IV.  HYBRID  EVENT-DETECTION  BASED  ON  FUZZY 
LOGIC  SYSTEM 


The double sliding window algorithm is a good approach to
event-detection. However, if an event appears continuously in the
sensing range of a node, both windows contain the event and the ratio
$m_n$ will again be flat. The probability of detection will decrease
accordingly. In order to solve this problem, we present a hybrid
event-detection algorithm based on a fuzzy logic system.

A.  Overview  of  Fuzzy  Logic  Systems 


Figure 2 shows the structure of a fuzzy logic system (FLS) [7]. When
an input is applied to an FLS, the inference engine computes the
output set corresponding to each rule. The defuzzifier then computes a
crisp output from these rule output sets. Consider a p-input 1-output
FLS, using singleton fuzzification, height defuzzification [7] and
"IF-THEN" rules of the form [8]

$R^l$: IF $x_1$ is $F_1^l$ and $x_2$ is $F_2^l$ and $\cdots$ and $x_p$
is $F_p^l$, THEN $y$ is $G^l$.

Assuming singleton fuzzification, when an input
$x' = \{x'_1, \ldots, x'_p\}$ is applied, the degree of firing
corresponding to the $l$th rule is computed as

$$\mu_{F_1^l}(x'_1) \star \mu_{F_2^l}(x'_2) \star \cdots \star \mu_{F_p^l}(x'_p) = \mathcal{T}_{i=1}^{p}\, \mu_{F_i^l}(x'_i) \tag{7}$$

where $\star$ and $\mathcal{T}$ both indicate the chosen t-norm. There
are many kinds of defuzzifiers. In this paper, we focus, for
illustrative purposes, on the height defuzzifier [7]. It computes a
crisp output for the FLS by first obtaining the height, $\bar{y}^l$,
of every consequent set $G^l$, and then computing a weighted average
of these heights. The weight corresponding to the $l$th rule
consequent height is the degree of firing associated with the $l$th
rule, $\mathcal{T}_{i=1}^{p}\mu_{F_i^l}(x'_i)$, so that

$$y_h(x') = \frac{\sum_{l=1}^{M} \bar{y}^l\, \mathcal{T}_{i=1}^{p}\mu_{F_i^l}(x'_i)}{\sum_{l=1}^{M} \mathcal{T}_{i=1}^{p}\mu_{F_i^l}(x'_i)} \tag{8}$$

where M is the number of rules in the FLS. In this paper, we design an
FLS for event detection in WSNs.
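As a minimal sketch of Eqs. (7) and (8), the following Python fragment
computes the firing degree of each rule and the height-defuzzified
output. The product is used as the t-norm here, which is an assumption
(the minimum is the other common choice). The names `firing_degree`
and `height_defuzzify` and the representation of a rule as a pair
(antecedent membership functions, consequent height) are illustrative
only.

```python
from typing import Callable, List, Sequence, Tuple

Membership = Callable[[float], float]
# A rule is (antecedent MFs F_1..F_p, height of the consequent set).
Rule = Tuple[Sequence[Membership], float]


def firing_degree(antecedents: Sequence[Membership],
                  x: Sequence[float]) -> float:
    """Eq. (7) with the product t-norm (an assumption)."""
    degree = 1.0
    for mf, xi in zip(antecedents, x):
        degree *= mf(xi)
    return degree


def height_defuzzify(rules: List[Rule], x: Sequence[float]) -> float:
    """Eq. (8): firing-degree-weighted average of consequent heights."""
    degrees = [firing_degree(antecedents, x) for antecedents, _ in rules]
    total = sum(degrees)
    if total == 0.0:
        raise ValueError("no rule fires for this input")
    weighted = sum(height * d for (_, height), d in zip(rules, degrees))
    return weighted / total


# Toy one-input example with two complementary membership functions.
toy_rules: List[Rule] = [
    ([lambda v: 1.0 - v], 0.0),   # IF x is "low"  THEN height 0
    ([lambda v: v], 10.0),        # IF x is "high" THEN height 10
]
crisp = height_defuzzify(toy_rules, [0.25])
```

With the toy rules, an input of 0.25 fires the two rules with degrees
0.75 and 0.25, so the crisp output is the weighted average of the two
heights.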


B.  Hybrid  event-detection  algorithm 


We have two inputs for the FLS: the accumulated signal energy $E_s$ in
a fixed period of time and the ratio $m_n$ of the accumulated signal
energy in two consecutive sliding windows. The linguistic variables
used to represent them were divided into three levels: low, moderate,
and high. The consequent, the possibility that an event occurs, was
divided into five levels: very strong, strong, medium, weak and very
weak. We used trapezoidal membership functions (MFs) to represent low,
high, very strong, and very weak, and triangular MFs to represent
moderate, medium, strong, and weak. We show these MFs in Figure 3(a)
and 3(b).

Fig. 2. The structure of a fuzzy logic system.

Because $E_s$ or $m_n$ should be high when an event occurs, we design
a fuzzy logic system using rules such as:

$R^l$: IF $E_s$ is $F_1^l$ and $m_n$ is $F_2^l$, THEN the possibility
that there is an event ($y$) is $G^l$.

where $l = 1, \ldots, 9$. We summarize all the rules in Table 1.

Table 1. Rules for event-detection. Antecedent 1 is $E_s$, Antecedent 2 is $m_n$.

Rule   Antecedent 1   Antecedent 2   Consequent
1      low            low            very weak
2      low            moderate       weak
3      low            high           medium
4      moderate       low            weak
5      moderate       moderate       medium
6      moderate       high           strong
7      high           low            medium
8      high           moderate       strong
9      high           high           very strong
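The rule base of Table 1 can be exercised with a short Python sketch.
The membership function breakpoints and the consequent heights below
are hypothetical placeholders chosen only for illustration, since the
actual MFs are given in Figure 3 and are not reproduced here. Inputs
are assumed to be normalized to [0, 1], and the product t-norm with
height defuzzification is used.

```python
from typing import Callable, Dict, Tuple

MF = Callable[[float], float]


def trapezoid(a: float, b: float, c: float, d: float) -> MF:
    """Trapezoidal MF: 1 on [b, c], 0 outside (a, d), linear in between.
    Setting b == c gives a triangular MF."""
    def mf(x: float) -> float:
        if b <= x <= c:
            return 1.0
        if x <= a or x >= d:
            return 0.0
        return (x - a) / (b - a) if x < b else (d - x) / (d - c)
    return mf


# Hypothetical antecedent MFs on inputs normalized to [0, 1].
ANTECEDENT: Dict[str, MF] = {
    "low": trapezoid(-1.0, 0.0, 0.2, 0.5),
    "moderate": trapezoid(0.2, 0.5, 0.5, 0.8),
    "high": trapezoid(0.5, 0.8, 1.0, 2.0),
}

# Hypothetical heights of the five consequent sets.
HEIGHT: Dict[str, float] = {
    "very weak": 0.0, "weak": 0.25, "medium": 0.5,
    "strong": 0.75, "very strong": 1.0,
}

# Table 1: (E_s label, m_n label) -> consequent label.
RULES: Dict[Tuple[str, str], str] = {
    ("low", "low"): "very weak",
    ("low", "moderate"): "weak",
    ("low", "high"): "medium",
    ("moderate", "low"): "weak",
    ("moderate", "moderate"): "medium",
    ("moderate", "high"): "strong",
    ("high", "low"): "medium",
    ("high", "moderate"): "strong",
    ("high", "high"): "very strong",
}


def event_possibility(es: float, mn: float) -> float:
    """Height-defuzzified possibility that an event is present."""
    num = den = 0.0
    for (es_label, mn_label), consequent in RULES.items():
        degree = ANTECEDENT[es_label](es) * ANTECEDENT[mn_label](mn)
        num += HEIGHT[consequent] * degree
        den += degree
    if den == 0.0:
        raise ValueError("no rule fires; inputs must lie in [0, 1]")
    return num / den
```

A low energy combined with a high window ratio fires rule 3 and gives
a medium possibility, which is the case the hybrid scheme is meant to
cover: either input alone can raise the output.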

V.  SIMULATIONS 


Figure 4 shows the basic data set used in our simulations, which was
collected from Berkeley MICA2 motes. In order to obtain the
probability of detection $P_d$ and the probability of false alarm
$P_f$, white Gaussian noise is added at an SNR of 10 dB. We ran
100,000 Monte-Carlo simulations. The results of each algorithm are
summarized in Table 2. In terms of both $P_d$ and $P_f$, the Double
Sliding Window scheme and the hybrid event-detection algorithm based
on FLS both perform much better than the signal strength
event-detection algorithm.


Fig. 3. MFs used to represent the linguistic labels. (a) MFs for
antecedent, and (b) MFs for consequent.


VI.  CONCLUSIONS 


In this paper, we proposed two event-detection algorithms for wireless
sensor networks: the Double Sliding Window scheme and a hybrid
approach based on a fuzzy logic system. We use the basic data set
collected with the MOTE-Kit [4] testbed, to which white Gaussian noise
is added. Simulation results show that both the Double Sliding Window
and the hybrid scheme based on FLS outperform the existing Signal
Strength event-detection algorithm in terms of both the probability of
detection and the probability of false alarm.

Table 2. Probability of detection ($P_d$) and probability of false alarm ($P_f$) of each algorithm.

Algorithm                                P_d       P_f
Signal Strength event-detection          69.75%    0.08%
Double Sliding Window event-detection    91.499%   0.02%
Hybrid event-detection based on FLS      99.97%    0.05%

Abstract — Energy-aware techniques that reduce energy consumption in
distributed sensor networks have become a prominent topic in sensor
network research. Various sensor network applications have taken
energy efficiency into consideration. In the case of correlated binary
sources, distributed source coding has been extensively studied in
information theory. However, data sources from real sensor networks
are normally non-binary. In this paper, we propose a spectrum
efficient coding scheme for correlated non-binary sources in sensor
networks. Our approach constructs the codeword cosets for the source
of interest, taking advantage of the statistical characteristics of
the distinct observations from sensor nodes. The coset leaders are
then transmitted via the channel and decoding is performed with the
available side information. Simulations are carried out over
independent and identically distributed (i.i.d) Gaussian sources and
data collected from an Xbow wireless sensor network test bed.
Simulation results show that the proposed scheme performs at 0.5 - 1.5
dB from the Wyner-Ziv distortion bound.

I.  Introduction 

A wireless sensor network consists of a number of small,
energy-constrained nodes. Such networks are normally deployed for data
collection where human intervention after deployment, to recharge or
replace node batteries, may not be feasible, resulting in limited
network lifetime. Failure of a number of sensors due to energy
depletion has a significant impact on the functioning of the entire
wireless sensor network.

Considerable research has been done to reduce the energy consumption
in wireless sensor networks, from the hardware design of individual
sensors to the routing and topology construction of the whole network.
Among these efforts, one distinct technology for energy-efficient
wireless sensor networks is distributed source coding (DSC) [1], [2].
DSC was proposed to encode the correlated sensor readings separately,
i.e. the sensors encoding the readings do not communicate with each
other. After the distributed encoding, the compressed data is sent to
a central hub node for joint decoding. Further research on this topic
demonstrated that convolutional codes [3] and Turbo and LDPC codes
[4], [5] perform well in distributed compression for sensor networks.
All these approaches are based on binary distributed sources with a
well-defined correlation between them. However, in a practical sensor
network, or even in a lab test bed of a wireless sensor network, the
distributed sensors have very rough readings which can hardly be
fitted into the above binary compression schemes.

In this paper, we address the spectrum efficient coding scheme for
correlated non-binary sources in wireless sensor networks. Our
approach attempts to provide a solution to the Chief Executive Officer
(CEO) problem. The goal of the CEO problem is to recover as much
information as possible about the actual event from the noisy
observations, while minimizing the total information rate. We propose
to exploit the statistical characteristics of real sensor readings
before constructing codeword cosets. Because the readings are
approximately Gaussian, Lloyd-Max quantization is applied to minimize
the mean square distortion. To save communication spectrum, a coset
encoder is designed to reduce the transmitted bits based on the
probability distribution of quantized values. We show that source
encoding can be completed in a fully distributed way: each sensor
encodes its own readings without knowing what the other sensors have
measured. Our work differs from previous ones not only in the
non-binary sources but also in proposing a practical coset encoding
scheme for real sensor readings. Simulations are carried out over
independent and identically distributed (i.i.d) Gaussian sources and
data collected from an Xbow wireless sensor network test bed.
Simulation results show that the proposed scheme performs at 0.5 - 1.5
dB from the Wyner-Ziv distortion bound.

This paper is organized as follows. In Section II, we briefly review
the basic concept of distributed source coding for correlated
information. Section III discusses the intuition behind our approach.
Section IV details the coset construction based on the statistical
knowledge of sensor readings. Simulation results are presented in
Section V. Section VI concludes with a summary.

II. Preliminaries

In this section, we review the basic concepts of distributed source
coding for correlated information and introduce Slepian-Wolf coding
for lossless source coding and Wyner-Ziv coding for the lossy case.

Consider a distributed wireless sensor network consisting of
individual sensors that monitor the sensor field. These sensors
transmit their highly correlated data to a central hub node, which
reconstructs the observations. Transmission of redundant information
can be easily avoided if the sensors communicate with each other, but
such inter-node cooperation requires higher bandwidth and consumes
more energy in communication. Slepian and Wolf [6] proved that, under
certain conditions, there is theoretically no loss in performance even
when the sensors do not communicate with each other. The Slepian-Wolf
theorem was later extended to the lossy coding of continuous-valued
sources by Wyner and Ziv [7].

A.  Slepian-Wolf  Coding 

Let X and Y be two correlated binary sources, each independent and
identically distributed (i.i.d). For lossless compression with
$X' = X$ and $Y' = Y$ after decompression, we know from Shannon's
source coding theory [8] that a rate given by the joint entropy
$H(X, Y)$ of X and Y is sufficient if we are encoding them together.

Fig. 1 gives an example of joint encoding and distributed encoding of
two binary sources. In Fig. 1 (a), encoder X compresses X into $H(X)$
bits per sample and, based on the complete knowledge of X at both
encoder and decoder, Y is then compressed into $H(Y|X)$ bits per
sample. In Fig. 1 (b), encoders X and Y do not communicate and perform
separate encoding.


Fig. 1. Correlated source coding configuration. (a) Joint encoding of
X and Y. The encoders communicate with each other and a rate H(X, Y)
is sufficient. (b) Distributed encoding of X and Y. The encoders do
not communicate. The Slepian-Wolf theorem proves that H(X, Y) is also
sufficient.

The Slepian-Wolf theorem [6] states that if X and Y are correlated
according to some arbitrary probability distribution $p(x, y)$, then X
can be compressed separately (without access to Y) without losing
performance compared to the configuration in Fig. 1 (a). It says that
the achievable region of DSC for discrete sources X and Y is given by
$R_X \ge H(X|Y)$, $R_Y \ge H(Y|X)$ and $R_X + R_Y \ge H(X, Y)$, which
is shown in Fig. 2.
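To make the three rate conditions concrete, the following Python
sketch computes the entropies from a joint probability mass function
and tests whether a rate pair lies in the Slepian-Wolf region. The
function names and the example distribution (a uniform binary X
observed through a binary symmetric channel with crossover probability
0.1) are illustrative assumptions, not part of the original paper.

```python
import math
from typing import Dict, Iterable, Tuple

JointPmf = Dict[Tuple[int, int], float]


def _entropy(probabilities: Iterable[float]) -> float:
    """Shannon entropy in bits; zero-probability terms contribute 0."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def entropies(joint: JointPmf) -> Tuple[float, float, float]:
    """Return (H(X,Y), H(X|Y), H(Y|X)) in bits for a joint pmf {(x, y): p}."""
    px: Dict[int, float] = {}
    py: Dict[int, float] = {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    h_xy = _entropy(joint.values())
    # Chain rule: H(X|Y) = H(X,Y) - H(Y) and H(Y|X) = H(X,Y) - H(X).
    return h_xy, h_xy - _entropy(py.values()), h_xy - _entropy(px.values())


def in_slepian_wolf_region(joint: JointPmf, rate_x: float,
                           rate_y: float) -> bool:
    """Check R_X >= H(X|Y), R_Y >= H(Y|X) and R_X + R_Y >= H(X,Y)."""
    h_xy, h_x_given_y, h_y_given_x = entropies(joint)
    return (rate_x >= h_x_given_y and rate_y >= h_y_given_x
            and rate_x + rate_y >= h_xy)


# Illustrative joint pmf: uniform binary X, Y equals X flipped w.p. 0.1.
bsc_joint: JointPmf = {(0, 0): 0.45, (0, 1): 0.05,
                       (1, 0): 0.05, (1, 1): 0.45}
```

For this example H(X|Y) is about 0.47 bits, so X can be sent at
roughly half a bit per sample when Y is sent at its full rate of one
bit, even though the two encoders never communicate.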

For practical Slepian-Wolf coding, the first attempt is to approach
the corner point A in the Slepian-Wolf rate region of Fig. 2 with
$R_X + R_Y = H(X|Y) + H(Y) = H(X, Y)$. This is actually a problem of
source coding of X with side information Y at the decoder, as shown in
Fig. 3. Similarly, the other corner point B of the Slepian-Wolf rate
region can be approached by exchanging the roles of X and Y, and all
points between the two corner points can be realized by time-sharing.

Fig. 2. The Slepian-Wolf region for two binary sources. Point A:
compression of X with side information Y at the joint decoder.

Fig. 3. One example of Slepian-Wolf coding: lossless source coding
with side information at the decoder.

B.  Wyner-Ziv  Coding 

The Slepian-Wolf scheme focuses on lossless source coding of discrete
sources with side information at the decoder. However, most sensor
network applications deal with continuous sources, so the rate
distortion with side information at the decoder becomes a major
concern. The problem to solve in lossy source coding is how many bits
are needed to encode X under the constraint that the average
distortion between X and its reconstruction $\hat{X}$ satisfies
$E[d(X, \hat{X})] \le D$, assuming the side information Y is available
only at the decoder.

Wyner and Ziv [7] first considered this problem and gave the
rate-distortion function for both discrete and continuous cases and
general distortion metrics d(.). Fig. 4 is an illustration of
Wyner-Ziv coding. Wyner-Ziv coding generalizes the Slepian-Wolf setup
in that X is coded with respect to a fidelity criterion rather than
losslessly.

Fig. 4. Wyner-Ziv coding or lossy source coding with side information

An important property of Wyner-Ziv coding is that it normally suffers
a rate loss when compared to lossy coding of X when the side
information Y is available at both the encoder and decoder. One
exception is when X and Y are jointly Gaussian, which is of special
interest in practice since many image and video sources can be modeled
as jointly Gaussian.
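For the jointly Gaussian exception, the rate-distortion function with
side information has a simple closed form under mean squared error
distortion, and it coincides with the rate needed when Y is also known
at the encoder. The Python sketch below evaluates it for the additive
model Y = X + N with independent Gaussian noise, which is the model
used later in the simulations. The function names are illustrative,
and the mean squared error distortion measure is an assumption of this
sketch.

```python
import math


def conditional_variance(var_x: float, var_n: float) -> float:
    """Variance of X given Y = X + N for independent Gaussian X and N."""
    return var_x * var_n / (var_x + var_n)


def wyner_ziv_rate_gaussian(var_x: float, var_n: float,
                            distortion: float) -> float:
    """Rate in bits per sample needed to reach MSE `distortion`
    when Y = X + N is available at the decoder."""
    ratio = conditional_variance(var_x, var_n) / distortion
    return max(0.0, 0.5 * math.log2(ratio))


# Unit-variance source and noise: the decoder already knows X to
# within a variance of 0.5, so only distortions below 0.5 cost any rate.
rate_bits = wyner_ziv_rate_gaussian(var_x=1.0, var_n=1.0, distortion=0.125)
```

In this example one bit per sample is enough to bring the distortion
from 0.5 down to 0.125, since each additional bit reduces the
achievable mean squared error by a factor of four.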

Since  we  are  introducing  distortion  to  the  source  with 
Wyner-Ziv  coding,  quantization  is  needed  in  source  coding. 
Usually  there  is  still  certain  correlation  in  the  quantized 
version  of  X  and  the  side  information  Y,  thus  Slepian-Wolf 
coding  could  be  employed  to  reduce  the  rate.  In  this  case,  the 

Fig. 5. Block diagram of a generic Wyner-Ziv coder


Fig.  6.  Noisy  observations  of  acoustic  signal  strength  from  four  distributed 
sensors.  The  four  sensors  are  not  in  equal  distance  to  the  acoustic  source  in 
network. 


III. Intuition Behind the Approach

In the above section, we discussed lossless (Slepian-Wolf) and lossy
(Wyner-Ziv) source coding with side information available only at the
decoder. Most of the work in DSC so far has focused on these two
problems. In a wireless sensor network, employing current DSC schemes
requires the sensor nodes transmitting correlated information to
cooperate in a small group, so that one node provides side information
and the others compress their information down to the Slepian-Wolf or
the Wyner-Ziv limit.

The major concern for practical application of DSC is the correlation
model. Theoretically, two correlated non-binary sources can be
constructed easily. An example with a uniform distribution is as
follows:

•  Let $X = X_0 X_1 \ldots$ and $Y = Y_0 Y_1 \ldots$ be two correlated
non-binary sequences taking values in $[L, R]$.

•  Generate the i.i.d sequence X using the probability distribution
$P(X_k = i) = 1/(R - L)$, where $i \in [L, R]$.

•  Define the sequence Y from the sequence X using the conditional
probability distribution $P(Y_k = j \mid X_k = i) = P_{ij}$, where
$i, j \in [L, R]$. The joint probability distribution between sources
will be denoted by $P(X_k = i, Y_k = j) = P_{ij}/(R - L)$.
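The construction in the list above can be sketched in Python as
follows. To match the probability 1/(R - L), this sketch assumes an
integer alphabet with R - L symbols, i.e. the values L, L+1, ..., R-1;
that reading of the interval, the function name and the `seed`
parameter are assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple


def generate_correlated(length: int, low: int, high: int,
                        transition: Sequence[Sequence[float]],
                        seed: int = 0) -> Tuple[List[int], List[int]]:
    """Draw X_k uniformly from {low, ..., high - 1} and Y_k from the
    row of `transition` selected by X_k, so that
    P(Y_k = j | X_k = i) = transition[i - low][j - low]."""
    rng = random.Random(seed)
    alphabet = list(range(low, high))  # R - L symbols (assumption)
    xs: List[int] = []
    ys: List[int] = []
    for _ in range(length):
        x = rng.choice(alphabet)
        y = rng.choices(alphabet, weights=transition[x - low])[0]
        xs.append(x)
        ys.append(y)
    return xs, ys


# Illustrative transition matrix: Y equals X with probability 0.9 and
# otherwise moves to the next symbol (wrapping around).
size = 4
noisy = [[0.9 if j == i else (0.1 if j == (i + 1) % size else 0.0)
          for j in range(size)] for i in range(size)]
x_seq, y_seq = generate_correlated(1000, low=2, high=6, transition=noisy)
```

A synthetic pair of this kind has a known joint distribution by
construction, which is exactly what is hard to obtain for readings
from deployed sensors.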

Although significant effort has been put into DSC design for various
correlation models, in real sensor networks there are still many
situations in which it is hard to come up with a joint probability
function. For instance, the correlation statistics of video
surveillance networks can be mainly a function of the sensors'
locations. Fig. 6 is another example, showing the noisy versions of
the acoustic signal strength collected using the Xbow wireless sensor
network professional developer's kit MOTE-Kit.

In this paper, we address the issue of lossy coding for correlated
non-binary sources in the Xbow wireless sensor networks. We are
interested in the measurement noise in wireless sensor networks,
specifically in the Chief Executive Officer (CEO) problem [9]. In this
problem, the CEO of a company employs a number of agents to observe an
event, and each of the agents provides the CEO with his/her noisy
version of the event. The agents are not allowed to convene, and the
goal of the CEO is to recover as much information as possible about
the actual event from the noisy observations received from the agents,
while minimizing the total information rate from the agents. The CEO
problem thus captures the measurement noise at the sensor nodes.

Preliminary practical code constructions for the CEO problem appeared
in [10], [11], based on the Wyner-Ziv coding approaches, but they are
limited to special cases. Fig. 7 is a CEO example in a wireless sensor
network, where the central hub node is responsible for recovering the
information from the noisy measurements.


Fig.  7.  A  CEO  example  of  sensor  network.  The  central  hub  node  broadcasts 
the  queries  and  collects  the  noisy  observations  from  the  sensors. 


IV.  Construct  Codeword  Cosets 


TABLE I

Results from 8-level Lloyd-Max Quantization


For correlated binary sources X and Y, Y is a noise-corrupted version
of X, Y = X + N, where N is an additive Gaussian noise. The
correlation between the output of interest X and the side information
Y can be modeled with a "virtual" correlation channel, so a good
channel code over this channel provides a good Slepian-Wolf code. In
this sense, what appears to be a source coding problem, Slepian-Wolf
coding, can be considered as a channel coding problem.

In this section, we detail our spectrum efficient coding scheme for
correlated non-binary sources in wireless sensor networks. For the
information of interest X, the encoder side consists of two parts: a
source encoder and a coset encoder. We apply Lloyd-Max quantization in
the source encoder, which designs the initial codebook. The non-binary
sources are then represented by binary codewords according to the
quantization levels. A coset encoder is constructed to save
transmitted bits over the channel. An n-bit codeword is transmitted as
an m-bit (m < n) coset leader to achieve a compression ratio of n : m
after the coset encoder. Side information Y is transmitted at full
rate, i.e. not through the coset encoder. The block diagram of our
coding scheme is illustrated in Fig. 8.


Fig. 8. Block diagram of the asymmetric coding scheme for correlated
non-binary sources
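As a rough illustration of the source encoder stage, the following
Python sketch runs the Lloyd-Max iteration on a set of samples. It
alternates between placing decision boundaries midway between adjacent
reconstruction levels and moving each level to the mean of the samples
in its cell. The quantile-based initialization, the iteration limit
and the function names are assumptions of this sketch rather than
details from the paper.

```python
from typing import List, Sequence


def lloyd_max(samples: Sequence[float], levels: int,
              iterations: int = 50) -> List[float]:
    """Return `levels` reconstruction values that locally minimize the
    mean squared quantization error on `samples`."""
    data = sorted(samples)
    # Initial codebook: evenly spaced sample quantiles (an assumption).
    codebook = [data[(2 * i + 1) * len(data) // (2 * levels)]
                for i in range(levels)]
    for _ in range(iterations):
        # Nearest-neighbour condition: boundaries at the midpoints.
        bounds = [(a + b) / 2 for a, b in zip(codebook, codebook[1:])]
        cells: List[List[float]] = [[] for _ in range(levels)]
        for s in data:
            cells[sum(s > b for b in bounds)].append(s)
        # Centroid condition: each level moves to the mean of its cell.
        updated = [sum(c) / len(c) if c else codebook[i]
                   for i, c in enumerate(cells)]
        if updated == codebook:
            break
        codebook = updated
    return codebook


def quantize(value: float, codebook: Sequence[float]) -> int:
    """Index of the reconstruction level closest to `value`."""
    return min(range(len(codebook)), key=lambda i: abs(value - codebook[i]))


two_clusters = [0, 0, 0, 1, 1, 1, 10, 10, 10, 11, 11, 11]
toy_codebook = lloyd_max(two_clusters, levels=2)
```

The index returned by `quantize` is what would be mapped to a binary
codeword and passed on to the coset encoder.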

We  next  give  an  example  of  constructing  the  coset  encoder. 


Example 1: Construct Codeword Cosets with Hamming Distance $d_H = 3$

For 8-level Lloyd-Max quantization, the input to the coset encoder is
a 3-bit binary codeword
$X_Q \in [000, 001, 011, 010, 110, 111, 101, 100]$. Assuming the
Hamming distance between $X_Q$ and the quantized binary side
information $Y_Q$ is $d_H(X_Q, Y_Q) \le 1$, the cosets for $X_Q$ can
be constructed using the parity-check matrix H

$$H = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} \tag{1}$$

Four cosets are constructed as $C_1 = [000, 111]$, $C_2 = [001, 110]$,
$C_3 = [010, 101]$ and $C_4 = [011, 100]$. The transmitted coset
leader for $X_Q$ is associated with the syndrome $s = X_Q H^T$.
Sending the 2-bit coset leader instead of the original 3-bit $X_Q$
achieves a compression ratio of 3 : 2.

□ 
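The cosets of Example 1 can be reproduced by grouping all 3-bit words
by their syndrome. The Python sketch below does this for the
parity-check matrix H of Example 1; the function names and the
dictionary representation are illustrative choices.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple

Word = Tuple[int, ...]

# Parity-check matrix H of Example 1, one tuple per row.
H1: List[Word] = [(1, 1, 0), (1, 0, 1)]


def syndrome(word: Word, parity_check: Sequence[Word]) -> Word:
    """Compute s = word * H^T over GF(2)."""
    return tuple(sum(h * w for h, w in zip(row, word)) % 2
                 for row in parity_check)


def build_cosets(parity_check: Sequence[Word],
                 n_bits: int = 3) -> Dict[Word, List[Word]]:
    """Group every n-bit word by its syndrome; each group is one coset."""
    cosets: Dict[Word, List[Word]] = {}
    for word in product((0, 1), repeat=n_bits):
        cosets.setdefault(syndrome(word, parity_check), []).append(word)
    return cosets


cosets = build_cosets(H1)
# Every coset pairs two words that differ in all three bits.
for members in cosets.values():
    first, second = members
    assert sum(a != b for a, b in zip(first, second)) == 3
```

The encoder only has to send the 2-bit syndrome; the decoder then
chooses between the two members of the indicated coset using the side
information.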

Now  let  us  consider  the  noisy  observation  from  sensor  node 
1  (see  Fig  6  (a))  as  the  interested  information  X.  Results  from 
8-level  Lloyd-Max  quantization  are  presented  in  Table  I. 


Codebook   Occurring Probability   Binary Codebook
498.09     0.4923                  000
500.37     0.2809                  001
503.06     0.1590                  011
507.3      0.0457                  010
511.31     0.0136                  110
515.26     0.0051                  111
523        0.0027                  101
544        0.0008                  100

From Table I, the first codeword after quantization, 498.09, occurs
with a dominant probability of 49.23%. The probability of occurrence
decreases dramatically along the initial codebook. We assign the
binary codewords such that, in order of decreasing probability,
adjacent codewords differ in only 1 bit.
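The binary codebook in Table I is the standard reflected Gray code,
which can be generated with one XOR per index. The short Python sketch
below reproduces the 3-bit assignment and checks the one-bit-difference
property; the function name is an illustrative choice.

```python
from typing import List


def gray_code(n_bits: int) -> List[str]:
    """Reflected Gray code: adjacent codewords differ in exactly one bit."""
    return [format(i ^ (i >> 1), f"0{n_bits}b") for i in range(1 << n_bits)]


codewords = gray_code(3)
# Adjacent entries differ in a single bit position.
for left, right in zip(codewords, codewords[1:]):
    assert sum(a != b for a, b in zip(left, right)) == 1
```

Assigning these codewords in order of decreasing probability means
that a small error in the quantized value tends to change only one bit
of the codeword.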

Suppose sensor node 3 (see Fig. 6 (c)) is transmitting the side
information Y for decoding. Data from sensor node 3 is quantized
separately using a Lloyd-Max quantizer. Now we have $X_Q$ and $Y_Q$ as
3-bit correlated binary codewords. A perfect coset encoder [1]
requires that $X_Q$ and $Y_Q$ be correlated in such a way that the
Hamming distance between $X_Q$ and $Y_Q$ is no more than one. The
cosets for $X_Q$ are then constructed so that the elements within each
coset have the maximal Hamming distance $d_H = 3$, as depicted in
Example 1.

In our work, the correlation between $X_Q$ and $Y_Q$ is unknown or can
hardly reach the perfect correlation. But with knowledge of the
codeword probability distribution, the coset construction can be done
in a different way.

We propose to design the coset sets to minimize the overall cross
ratio. We define the cross ratio as the ratio at which, within one
coset, the codeword with the lower occurrence probability will cross
the other. We intend to decrease decoding failures by reducing the
cross ratio while keeping the Hamming distance within each coset as
large as possible. Table II gives the cross ratio of two different
coset sets.

TABLE II

Cross Ratio of Two Coset Sets

Coset Set 1   Cross Ratio   Coset Set 2   Cross Ratio
(000,111)     0.01          (000,110)     0.027
(001,110)     0.046         (001,111)     0.018
(011,100)     0.01          (011,101)     0.017
(010,101)     0.056         (010,100)     0.017
Overall       0.0305        Overall       0.01975

From Table II, we see that coset set 2 has a lower cross ratio even
though the Hamming distance within each coset is $d_H = 2$ rather than
3.

The parity-check matrix used to construct coset set 2, with Hamming
distance $d_H = 2$, is shown below:

$$H = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix} \tag{2}$$

At the decoder, we use the side information $Y_Q$ to look for the
most likely codeword in the coset represented by the transmitted coset
leader. The decoder then obtains the optimal estimate of X using all
received information.
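The decoding step can be sketched in Python as follows, using the
parity-check matrix of Eq. (2) and the occurrence probabilities of
Table I. The decision rule here, minimum Hamming distance to the side
information with ties broken in favour of the more probable codeword,
is an assumption of this sketch; the paper only states that the most
likely codeword in the coset is selected. The function names are
illustrative.

```python
from itertools import product
from typing import Dict, Sequence, Tuple

Word = Tuple[int, ...]

# Parity-check matrix of Eq. (2), one tuple per row.
H2 = [(0, 0, 1), (1, 1, 0)]

# Occurrence probabilities from Table I, keyed by binary codeword.
PRIOR: Dict[Word, float] = {
    (0, 0, 0): 0.4923, (0, 0, 1): 0.2809, (0, 1, 1): 0.1590,
    (0, 1, 0): 0.0457, (1, 1, 0): 0.0136, (1, 1, 1): 0.0051,
    (1, 0, 1): 0.0027, (1, 0, 0): 0.0008,
}


def syndrome(word: Word, parity_check: Sequence[Word]) -> Word:
    """Compute s = word * H^T over GF(2)."""
    return tuple(sum(h * w for h, w in zip(row, word)) % 2
                 for row in parity_check)


def hamming(a: Word, b: Word) -> int:
    """Number of bit positions in which a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(received_syndrome: Word, side_info: Word,
           parity_check: Sequence[Word] = H2,
           prior: Dict[Word, float] = PRIOR) -> Word:
    """Pick the coset member closest to the side information; on a tie
    prefer the codeword with the larger prior (an assumption)."""
    coset = [w for w in product((0, 1), repeat=len(side_info))
             if syndrome(w, parity_check) == received_syndrome]
    return min(coset, key=lambda w: (hamming(w, side_info), -prior[w]))


sent = (0, 1, 1)
recovered = decode(syndrome(sent, H2), side_info=(0, 1, 1))
```

With $d_H = 2$ inside each coset, a side-information word at distance
one from both members cannot be resolved by distance alone, which is
where the prior from Table I enters in this sketch.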

V.  Simulation  Results 

Our simulations are performed over the noisy acoustic observations
from the Xbow wireless sensor network professional developer's kit
MOTE-Kit. We collected 8 sets of noisy acoustic observations from 8
sensors deployed in a distributed manner in a lab. Information from
the sensor closest to the acoustic source is used as side information
for decoding. All others are encoded separately and reconstructed at
the decoder with the side information. The correlation-in-dB between
the information of interest and the side information is presented in
Table III.


TABLE III

Correlation-in-dB between X and Y

Sensor Node   Correlation-in-dB
1             5.7988
2             5.0864
3             6.3903
4             4.2262
5             4.2343
6             4.1522
7             5.5238

Due to packet loss in data collection at the central hub node, the
correlation between the information of interest X and the side
information Y from sensor node 8 is quite low. We choose the two
sensor nodes (node 1 and node 3) with the highest correlation to the
side information Y for our simulation.

For comparison, we generate two ideal i.i.d Gaussian sequences X and Y
correlated by Y = X + N, where X has zero mean and unit variance and N
is zero-mean Gaussian noise with variance $\sigma_N^2$. Y, the
corrupted version of X, is the side information used for joint
decoding.


Fig.  9.  Probability  of  Error  for  R=2bits/sample,  Lloyd-Max  quantization  and 
coset  encoder. 


Fig.  10.  Normalized  Distortion  for  R=2bits/sample,  Lloyd-Max  quantization 
and  coset  encoder. 

We employ 8-level, 16-level and 32-level Lloyd-Max quantization. Each
is partitioned into two cosets, where each coset set contains 2, 4 and
8 codewords respectively. The number of samples used for the Monte
Carlo simulations is $10^7$. Fig. 9 shows the probability of decoding
error for the above three schemes, and Fig. 10 plots the normalized
distortion, counting correctly decoded samples only, versus
correlation SNR for the same schemes. Observe that for a given
correlation SNR, as the number of quantization levels increases, the
normalized distortion decreases and the probability of decoding error
increases. Ideally, for a given transmission rate, we want to quantize
with a large number of levels to cut down distortion, but the tradeoff
between the distortion and the probability of decoding error
constrains this choice. As can be noted from Fig. 9 and 10, at 8-level
quantization, the performance of sensor readings from node 1 and node
3 is approximately 0.5 dB from that of ideal i.i.d Gaussian sources.

In coset encoding, we compare different coset construction methods.
Fig. 11 gives the result of coset set 1 and coset set 2 at 8-level
quantization. The performance of coset set 2 is slightly better than
that of coset set 1.

Last, we apply our coding scheme to all 7 sensor nodes and compute the
actual number of transmitted data bits. We process the information
observed in the common epoch and discard the incomplete observations.
Under the scheme of two coset sets, the number of transmitted data
bits remains the same for the 8-, 16- and 32-level cases. Results are
presented in Table IV.


TABLE IV

Compression Ratio of real transmitted data

Levels     Original bits   Transmitted bits   Compression Ratio
8-level    78339           31815              2.46
16-level   104532          31815              3.29
32-level   130665          31815              4.11

[6] D. Slepian and J. K. Wolf, "Noiseless coding of correlated
information sources," IEEE Trans. Inform. Theory, vol. 19, no. 4,
pp. 471-480, July 1973.

[7] A. D. Wyner and J. Ziv, "The rate-distortion function for source
coding with side information at the decoder," IEEE Trans. Inform.
Theory, vol. 22, no. 1, pp. 1-10, Jan. 1976.

[8] T. Cover and J. Thomas, "Elements of Information Theory," New
York, Wiley, 1997.

[9] T. Berger, Z. Zhang and H. Viswanathan, "The CEO problem
[multiterminal source coding]," IEEE Trans. Inform. Theory, vol. 42,
pp. 887-902, May 1996.

[10] S. Pradhan and K. Ramchandran, "Generalized coset codes for
symmetric distributed source coding," submitted to IEEE Trans. Inform.
Theory, Feb. 2003.

for remote multiterminal source coding," Proc. DCC'04, Snowbird, UT,
Mar. 2004.


Fig.  11.  Probability  of  Error  for  R=2bits/sample,  Lloyd-Max  quantization 
and  coset  encoder. 


VI.  Conclusions 

In this paper, we have proposed a spectrum efficient coding scheme for
correlated non-binary sources in sensor networks. Instead of using
theoretically ideal data, our scheme is based on the statistical
characteristics of the correlated non-binary sources from a real
sensor network. The coset construction introduced in this paper
leverages the inherent correlations between sensor observations and,
more importantly, decreases the probability of decoding error by
minimizing the cross ratio. The proposed scheme performs at 0.5 - 1.5
dB from the Wyner-Ziv distortion bound. We believe our approach
provides a practical solution for distributed compression of acoustic
sensor observations and can be extended to the CEO problem. Our future
work will concentrate on spectrum efficient coding for distributed
sources with memory, which has rarely been studied so far.

Abstract — A wireless sensor network (WSN) is designed to perform
various information processing tasks such as event detection, target
tracking and data classification. Compared with traditional
centralized networks, networked sensing offers unique advantages in
improved robustness and scalability. Measures of performance for these
tasks are well defined, including detection of false alarms or misses,
classification errors, and track quality. In this paper, we present a
fundamental performance analysis of event detection in wireless sensor
networks. Our performance analysis is based on a new detection scheme,
double sliding window (DSW) event detection. We compare it
theoretically against the fixed threshold approach.

I.  Introduction 

Research on sensor networks was originally motivated by military
applications. Starting around 1980, networked microsensor technology
has been widely used in military applications. One example of such
applications is the Cooperative Engagement Capability (CEC) developed
by the U.S. Navy. This network-centric warfare system consists of
multiple radars collecting data on air targets [1]. Other military
sensor networks include acoustic sensor arrays for antisubmarine
warfare such as the Fixed Distributed System (FDS) and the Advanced
Deployable System (ADS), and unattended ground sensors (UGS) such as
the Remote Battlefield Sensor System (REMBASS) and the Tactical Remote
Sensor System (TRSS).

Nowadays, small and inexpensive sensors based upon
microelectromechanical system (MEMS) [2] technology, wireless
networking, and inexpensive low-power processors allow the
deployment of wireless sensor networks for various non-military
applications, from environment and habitat monitoring,
to industrial process control, to infrastructure security [3]
and transportation automation.

A wireless sensor network (WSN) consists of a number of
small, energy-constrained nodes. The basic components of a
sensor node include one or more sensor modules, a wireless
transmitter-receiver module, a computational module and a
power supply module. Such networks are normally deployed for
data collection where human intervention after deployment, to
recharge or replace node batteries, may not be feasible.
Therefore, the energy constraint is a distinguishing
characteristic of WSNs compared to traditional wireless ad-hoc
networks. According to [4], energy consumption occurs in three
domains: sensing, data processing (including AD/DA and digital
signal processing), and communications. The authors of [5]
found that the sensor and signal processing parts operate at low
frequency and consume less than 1 mW. This is over an order
of magnitude less than the energy consumption of the
communication part. Therefore, we prefer less communication
and data exchange between sensor nodes and more local
processing within a single sensor node, so as to increase the
lifetime of the WSN.

The main goal of wireless sensor networks is to monitor the
physical world. Most of the time, no event happens in the
sensed field or surveillance zone, so the sensed data need not
be stored for a long time or transmitted to the gateway.
Usually, people are more interested in unexpected events. For
example, in a battlefield scenario, people are more interested
in the appearance of enemies. If a wireless sensor network
monitors forest fires, an unusual increase of the temperature
should trigger a warning. Both the appearance of enemies and
the unusual increase of the temperature can be seen as events.
Because of the energy, storage, and memory constraints of
wireless sensor networks, the ideal state of a wireless sensor
network is event-driven, so that the RF communication circuits
can be powered off most of the time. Only when certain sensor
nodes detect an event do they trigger the RF channel and
transmit the useful information to the gateway or headquarters.
Therefore, event detection is one of the key issues for wireless
sensor networks, and it is a very efficient way of self-managing,
which helps to relieve the memory, storage and energy
constraints.

Performance  of  wireless  sensor  network  applications  is 
measured  in  several  ways  including  detection  of  false  alarms  or 
misses,  classification  errors,  and  track  quality.  In  this  paper,  we 
present  a  fundamental  performance  analysis  of  event  detection 
in  wireless  sensor  networks.  We  introduce  a  new  scheme  of 
event  detection  for  WSN  -  double  sliding  window  (DSW) 
event  detection  and  analyze  the  fundamental  performance:  the 
probability  of  detection  and  the  probability  of  false  alarm  over 
this  new  detection  scheme. 

The rest of this paper is organized as follows. Section II
introduces a common type of sensor model for tracking: the
acoustic amplitude sensor model. Double sliding window event
detection is described in Section III. In Section IV we detail the
fundamental performance analysis of the proposed detection
scheme. Section V concludes this paper.

II.  Acoustic  Amplitude  Sensor  Model 

Localizing and tracking moving objects is an essential
capability for a sensor network in many practical applications.
Another class of sensor network applications concerns
the problem of sensing/detecting a field. Although they
may seem quite different from each other, both require
collaborative processing among sensor nodes along the temporal
dimension as well as in the spatial domain [6]. In the field
sensing case, the collaboration among sensors primarily occurs
in the spatial domain and occasionally along the temporal
dimension when the field evolves over time. In our study, we
focus on the field sensing/detecting problem.

A.  Notation  and  Assumptions 

We  use  the  following  notation  in  our  formulation  of  the 
sensing/detecting  problem  in  a  sensor  network: 

• Superscript $t$ denotes time. We consider discrete times $t$
that are nonnegative integers.

• Subscript $i \in [1, \ldots, K]$ denotes the sensor index; $K$ is
the total number of sensors in the network.

• Subscript $j \in [1, \ldots, N]$ denotes the target index; $N$ is the
total number of targets being observed.

• The target state at time $t$ is denoted as $x^{(t)}$. For a multi-target
sensing/detecting problem, this is a concatenation
of individual target states $x_j^{(t)}$.

• The measurement of sensor $i$ at time $t$ is denoted as $z_i^{(t)}$.

• The measurement history up to time $t$ is denoted as
$\bar{z}^{(t)} = \{z^{(0)}, z^{(1)}, \ldots, z^{(t)}\}$. The measurements may
originate from a single sensor or a set of sensors.

• The collection of all sensor measurements at time $t$ is
denoted as $z^{(t)} = \{z_1^{(t)}, z_2^{(t)}, \ldots, z_K^{(t)}\}$.

In this paper, we consider a single sound source as the
target ($N = 1$) and the target state $x^{(t)}$ is the location of the
target in a two-dimensional plane. Each sensor measures the
received signal strength reflected from the target. We make the
assumption that the sensor characteristics are time-invariant
and the target is located at a fixed position.

B.  Sensing  Model 

The time-dependent measurement $z_i^{(t)}$ of sensor $i$ with
characteristics $\lambda_i^{(t)}$ is related to the target state $x^{(t)}$ through
the following observation model,

$$z_i^{(t)} = h(x^{(t)}, \lambda_i^{(t)}) \tag{1}$$

where $h$ is a function depending on $x^{(t)}$ and parameterized
by $\lambda_i^{(t)}$, which represents our knowledge about sensor $i$.
In our study, we consider the sensing model for a single
target with $x$ representing the location of the target. Typical
characteristics $\lambda_i^{(t)}$ of sensor $i$ include the sensing modality
(e.g. what kind of sensor $i$ is), the sensor position and other
parameters, such as the noise model of sensor $i$. Normally, the
sensor characteristics are relatively stable compared with the
more dynamic measurements.

Eq. (1) is a general form of the observation model that
accounts for possibly nonlinear relations between the sensor
type, sensor position, noise model, etc. A special case of (1)
would be

$$h(x^{(t)}, \lambda_i^{(t)}) = f_i(x^{(t)}, \lambda_i^{(t)}) + w_i^{(t)} \tag{2}$$

where $f_i$ is an observation function, and $w_i^{(t)}$ is additive, zero
mean noise with known covariance.

In order to illustrate the idea, we consider the problem
of stationary target localization with time-invariant sensor
characteristics. In this paper, we assume that all sensors are
acoustic sensors measuring only the amplitude of the received
sound signal so that the state parameter $x$ is the unknown
target position. Note that under our assumption, there is no
longer a time dependence for $x$ and $\lambda_i$. Assuming that acoustic
signals propagate isotropically, the parameters are related to
the measurements by

$$z_i = \frac{a_i}{\|x - \zeta_i\|^{\alpha/2}} + w_i \tag{3}$$

where $a_i$ is a given random variable representing the amplitude
of the signal at the target, $\zeta_i$ is the position of sensor $i$,
$\alpha$ is a known attenuation coefficient, and $\|\cdot\|$ is the
Euclidean norm. The term $w_i$ is a zero mean Gaussian random
variable with variance $\sigma_i^2$.


C.  Acoustic  Amplitude  Sensor 

There  are  two  common  types  of  sensors  for  detecting  and 
tracking:  acoustic  amplitude  sensors  and  direction-of-arrival 
(DOA)  sensors.  In  this  section,  we  detail  the  characteristics  of 
the  acoustic  amplitude  sensors. 

An  acoustic  amplitude  sensor  node  measures  sound  ampli¬ 
tude  at  the  microphone  and  estimates  the  distance  to  the  target 
based  on  the  physics  of  sound  attenuation.  Generally,  range 
sensors  estimate  distance  based  on  received  signal  strength  or 
time  difference  of  arrival  (TDOA). 

Assuming that the sound source is a point source and sound
propagation is lossless and isotropic, a root-mean-squared
(RMS) amplitude measurement $z$ is related to the sound source
position $x$ as

$$z = \frac{a}{\|x - \zeta\|} + w \tag{4}$$

where $a$ is the RMS amplitude of the sound source, $\zeta$ is the
location of the sensor, and $w$ is RMS measurement noise [7].
This is a special case of (3). For simplicity, we model $w$ as
Gaussian with zero mean and variance $\sigma^2$.
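
As a minimal sketch of the measurement model in (4), the following Python snippet generates noisy amplitude readings for a point source. The function name, the sensor coordinates and all numerical values are assumptions chosen for illustration; they do not come from the experiments in this paper.

```python
import math
import random


def amplitude_measurement(source_xy, sensor_xy, a, sigma, rng):
    """Return one reading z = a / ||x - zeta|| + w, with w ~ N(0, sigma^2)."""
    distance = math.dist(source_xy, sensor_xy)
    if distance == 0.0:
        raise ValueError("source and sensor must not coincide")
    return a / distance + rng.gauss(0.0, sigma)


if __name__ == "__main__":
    rng = random.Random(0)
    source = (3.0, 4.0)                              # hypothetical target position x
    sensors = [(0.0, 0.0), (3.0, 0.0), (6.0, 8.0)]   # hypothetical sensor positions
    for zeta in sensors:
        z = amplitude_measurement(source, zeta, a=10.0, sigma=0.05, rng=rng)
        print(zeta, round(z, 3))
```

With the noise switched off (sigma = 0), the reading reduces to the deterministic term a / ||x - zeta||, which is how a range estimate could in principle be read back from the amplitude.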


III. Double Sliding Window Event Detection

The ability of a sensor receiver to detect a weak echo signal
is limited by the noise that occupies the same part of the
frequency spectrum as the signal. Detection of an acoustic
signal is based on establishing a threshold at the output of the
receiver. If the receiver output exceeds the threshold, a target
is said to be present. This is called threshold detection. Fig. 1
represents the output of an acoustic receiver as a function of
time. The fluctuating appearance of the output is due to the
random nature of receiver noise.

A threshold level in Fig. 1 is shown by the long dashed
line. If the signal is large enough, as at A and B, a target
is reported to be present, but C is a missed detection at the
given threshold. The signal at C would have been detected
if the threshold were lower. However, too low a threshold increases
the likelihood that noise alone will exceed the threshold and
cause a false alarm.


Fig.  1.  Envelope  of  the  radar  receiver  output  as  a  function  of  time.  A,  B 
and  C  represent  signal  plus  noise.  A  and  B  would  be  valid  detections,  but 
C  is  a  missed  detection 

In [8], the received signal strength from acoustic sensors
is integrated over a fixed period of time to give the energy $S$;
when $S$ exceeds a threshold, the authors declare that an event
has been detected:

$$S = \sum_{l=0}^{M-1} |z_{n-l}|^2 \tag{5}$$

$$S \ge S_{\text{threshold}} \tag{6}$$

where $z$ denotes the measurement of received signal strength
at each sampling point and $M$ is the length of the
observing/sampling window.

However,  this  simple  method  suffers  from  a  significant 
drawback;  namely,  the  value  of  the  threshold  depends  on  the 
sensed signal energy. When there is no event occurring in the
sensing  range,  the  sensed  signal  consists  of  only  noise.  The 
level  of  the  noise  power  is  generally  unknown  and  can  change 
when  the  environment  changes  or  if  unwanted  interferers  go 
on  and  off.  Therefore,  it  is  quite  difficult  to  set  a  fixed 
threshold.  We  design  a  double  sliding  window  algorithm  for 
event-detection  so  as  to  alleviate  the  threshold  value  selection 
problem. 
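
The fixed threshold rule of (5) and (6), and its sensitivity to the noise floor, can be sketched as follows. This is only an illustration: the function names, the window length, the two noise levels and the threshold value are hypothetical choices, not values taken from [8] or from this paper.

```python
import random


def window_energy(samples, n, M):
    """Energy S of (5): sum of |z_{n-l}|^2 for l = 0..M-1 (window ends at index n)."""
    if M <= 0 or n - M + 1 < 0 or n >= len(samples):
        raise IndexError("window does not fit inside the sample record")
    return sum(abs(samples[n - l]) ** 2 for l in range(M))


def fixed_threshold_detect(samples, n, M, s_threshold):
    """Decision rule of (6): report an event when S reaches the threshold."""
    return window_energy(samples, n, M) >= s_threshold


if __name__ == "__main__":
    rng = random.Random(1)
    M = 32
    # Two noise-only records with different (assumed) noise standard deviations.
    quiet = [rng.gauss(0.0, 1.0) for _ in range(M)]
    loud = [rng.gauss(0.0, 3.0) for _ in range(M)]
    s_threshold = 2.0 * M  # hypothetical threshold tuned for unit-variance noise
    for name, record in (("quiet", quiet), ("loud", loud)):
        energy = window_energy(record, M - 1, M)
        print(name, round(energy, 1), fixed_threshold_detect(record, M - 1, M, s_threshold))
```

Because the expected noise energy in a window scales with the noise variance, a threshold tuned for one noise level tends to be crossed by noise alone once the noise floor rises, which is the drawback described above.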

The  double  sliding  window  event-detection  algorithm  cal¬ 
culates  two  consecutive  sliding  windows  of  the  sensed  signal 
energy.  The  basic  principle  is  to  form  the  decision  variable  as 
the  ratio  of  the  total  energy  contained  inside  the  two  windows. 
Fig.  2  shows  two  consecutive  windows  A  and  B  (note  that 
window  B  arrives  after  window  A)  and  the  response  of  the 
ratio  Rs  to  a  sensed  event.  It  can  be  seen  that  when  only 
noise  is  sensed  the  response  is  nearly  flat,  since  both  windows 
contain  ideally  the  same  amount  of  noise  energy. 




Fig.  2.  Illustration  of  double  sliding  window  event  detection,  A  and  B  are 
two  continuous  sampling  windows  with  same  length  and  B  arrives  after  A. 

The values of window A and window B are calculated as

$$S_a = \sum_{l=0}^{M-1} |z_{n-l}|^2 \tag{7}$$

$$S_b = \sum_{l=0}^{M-1} |z_{n+M-l}|^2 \tag{8}$$

Then the decision variable $R_s$ is

$$R_s = \frac{S_b}{S_a} \tag{9}$$

The advantage of this approach is that the decision variable $R_s$
does not depend on the absolute sensed signal energy, but on the ratio
of the energies of two consecutive windows.
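
A short sketch of the decision variable in (7) to (9) is given below. The function names and the toy sample record are assumptions made for illustration. Window A covers the M samples ending at index n and window B covers the next M samples.

```python
def dsw_ratio(samples, n, M):
    """Decision variable R_s = S_b / S_a of (7)-(9).

    Window A covers indices n-M+1..n and window B covers n+1..n+M.
    """
    if M <= 0 or n - M + 1 < 0 or n + M >= len(samples):
        raise IndexError("both windows must fit inside the sample record")
    s_a = sum(abs(samples[n - l]) ** 2 for l in range(M))
    s_b = sum(abs(samples[n + M - l]) ** 2 for l in range(M))
    if s_a == 0.0:
        raise ZeroDivisionError("window A contains no energy")
    return s_b / s_a


def dsw_detect(samples, n, M, ratio_threshold):
    """Report an event when the energy ratio exceeds the ratio threshold."""
    return dsw_ratio(samples, n, M) > ratio_threshold


if __name__ == "__main__":
    record = [1.0, -1.0, 1.0, -1.0, 2.0, -2.0, 2.0, -2.0]
    print(dsw_ratio(record, 3, 4))                      # 4.0
    print(dsw_ratio([3.0 * v for v in record], 3, 4))   # 4.0
```

Scaling every sample by the same factor leaves the ratio unchanged, so the same ratio threshold can be used at different absolute signal levels.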

IV.  Fundamental  Performance  Analysis 
A.  Fixed  Threshold  Event  Detection 

In the acoustic sensor, the sensed signal needs to pass an
IF filter after the A/D converter. If there is no event (only
noise exists) in the observed sampling window, the noise
at the input to the IF filter can be described by a Gaussian
probability density function (pdf) with mean value of zero and
variance $\psi_0$. Rice [9] has shown that when Gaussian noise is
passed through the IF filter, the pdf of the noise envelope $R$
follows a Rayleigh distribution:

$$p_n(R) = \frac{R}{\psi_0} \exp\left(-\frac{R^2}{2\psi_0}\right) \tag{10}$$

The probability that the envelope of the noise will exceed
the fixed threshold $V_D$ is

$$P_{fa} = \int_{V_D}^{\infty} \frac{R}{\psi_0} \exp\left(-\frac{R^2}{2\psi_0}\right) dR = \exp\left(-\frac{V_D^2}{2\psi_0}\right) \tag{11}$$

which is the probability of false alarm.
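
The Rayleigh tail probability can be evaluated and inverted directly. The sketch below is illustrative: the function names are hypothetical, and the Monte Carlo check assumes the envelope is formed from two independent zero mean Gaussian components, each with variance psi0.

```python
import math
import random


def fixed_threshold_false_alarm(v_d, psi0):
    """P_fa = exp(-V_D^2 / (2 psi0)), the tail of a Rayleigh envelope."""
    if v_d < 0.0 or psi0 <= 0.0:
        raise ValueError("need v_d >= 0 and psi0 > 0")
    return math.exp(-v_d * v_d / (2.0 * psi0))


def threshold_for_false_alarm(p_fa, psi0):
    """Invert the tail formula: V_D = sqrt(-2 psi0 ln P_fa)."""
    if not 0.0 < p_fa <= 1.0 or psi0 <= 0.0:
        raise ValueError("need 0 < p_fa <= 1 and psi0 > 0")
    return math.sqrt(-2.0 * psi0 * math.log(p_fa))


def simulate_false_alarm(v_d, psi0, trials, rng):
    """Monte Carlo estimate using the envelope of two N(0, psi0) components."""
    sd = math.sqrt(psi0)
    hits = 0
    for _ in range(trials):
        envelope = math.hypot(rng.gauss(0.0, sd), rng.gauss(0.0, sd))
        if envelope > v_d:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    psi0 = 2.0                                   # assumed noise variance
    v_d = threshold_for_false_alarm(0.01, psi0)  # threshold for a 1% target rate
    print(v_d, fixed_threshold_false_alarm(v_d, psi0))
    print(simulate_false_alarm(v_d, psi0, 20000, random.Random(7)))
```

The inversion makes the dependence on the noise variance explicit: the threshold that keeps a given false alarm rate grows with the square root of psi0, so it has to be retuned whenever the noise level changes.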

If there is an event occurring, the sensed signal
at the input of the IF filter has a Gaussian pdf with mean
value $m$ and variance $\psi_0$. The pdf of the envelope $R$ of the
sensed signal passing the IF filter has a Rice distribution [9]:

$$p_s(R) = \frac{R}{\psi_0} \exp\left(-\frac{R^2 + m^2}{2\psi_0}\right) I_0\left(\frac{Rm}{\psi_0}\right) \tag{12}$$

where $I_0(z)$ is the zero-order modified Bessel function. The
probability of detection $P_d$ is the probability that the envelope
$R$ will exceed the threshold $V_D$:

$$P_d = \int_{V_D}^{\infty} p_s(R)\, dR \tag{13}$$

We are interested in the optimal value of the threshold
$V_D$. To obtain $V_D$, we use maximum a posteriori (MAP) detection.
The decision boundary is

$$\frac{p(R \mid H_2)}{p(R \mid H_1)} = \frac{p(H_1)}{p(H_2)} \tag{14}$$

where $p(H_1)$ is the probability of no events and $p(H_2)$ is the
probability of events happening in one observation. Assuming we
know $p(H_1)$ and $p(H_2)$, the optimal threshold
$V_D$ can be derived from (14).

Let $\beta = p(H_1)/p(H_2)$. Applying (10) and (12), we get

$$\beta \frac{R}{\psi_0} \exp\left(-\frac{R^2}{2\psi_0}\right) = \frac{R}{\psi_0} \exp\left(-\frac{R^2 + m^2}{2\psi_0}\right) I_0\left(\frac{Rm}{\psi_0}\right) \tag{15}$$
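
The MAP boundary can also be located numerically. The sketch below assumes the boundary is the point where the prior-weighted likelihoods are equal, p(H1) p(R|H1) = p(H2) p(R|H2), with a Rayleigh envelope under H1 and a Rice envelope under H2. The common factor R/psi0 then cancels and the condition reduces to exp(-m^2/(2 psi0)) I0(R m/psi0) = beta. The function names, the series evaluation of I0 and the bisection search are illustrative choices, not the authors' procedure.

```python
import math


def bessel_i0(x):
    """Zero-order modified Bessel function evaluated by its power series."""
    quarter_x2 = 0.25 * x * x
    term = 1.0
    total = 1.0
    k = 0
    while term > 1e-17 * total:
        k += 1
        term *= quarter_x2 / (k * k)
        total += term
    return total


def likelihood_ratio(r, m, psi0):
    """Rice pdf divided by Rayleigh pdf at envelope value r."""
    return math.exp(-m * m / (2.0 * psi0)) * bessel_i0(r * m / psi0)


def map_threshold(beta, m, psi0):
    """Envelope value where the likelihood ratio equals beta (0 if always above)."""
    if beta <= 0.0 or m <= 0.0 or psi0 <= 0.0:
        raise ValueError("beta, m and psi0 must be positive")
    if likelihood_ratio(0.0, m, psi0) >= beta:
        return 0.0
    lo, hi = 0.0, 1.0
    while likelihood_ratio(hi, m, psi0) < beta:   # ratio increases with r
        lo, hi = hi, 2.0 * hi
    for _ in range(200):                          # plain bisection
        mid = 0.5 * (lo + hi)
        if likelihood_ratio(mid, m, psi0) < beta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical values: equal priors (beta = 1), m = 2, psi0 = 1.
    print(map_threshold(1.0, 2.0, 1.0))
```

Since I0 is increasing for positive arguments, the likelihood ratio is monotone in R and the crossing point is unique whenever it exists, which is why a simple bisection suffices.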

The optimal threshold $V_D$ is the solution of (15) and $V_D$ can
be written as:


Fig. 3. Case of Detecting an Event (window B arrives after window A). Window A: noise, Rayleigh distribution; window B: event, Rice distribution.


1) Probability of Detection: As shown in Fig. 3, in the observing
window A, zero mean Gaussian noise with variance $\psi_0$
passes through the IF filter, and the pdf of the envelope of the signal
strength follows a Rayleigh distribution as in (10). In window
B, since there is an event occurring, the pdf of the envelope
of the sensed signal strength passing the IF filter has a Rice
distribution as in (12).

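Let $X = S_b$, $Y = S_a$ and $Z = R_s$ (referring to Section III).
We obtain the pdf of the decision variable $Z = X/Y$ as follows.

Since the random variables $X$ and $Y$ are independent,
we have

$$f_Z(z) = \int_{y=0}^{\infty} y\, f_X(yz)\, f_Y(y)\, dy \tag{17}$$

Using the pdfs of observing windows A and B, we get

$$f_Z(z) = \int_{y=0}^{\infty} y \cdot \frac{yz}{\psi_0} \exp\left(-\frac{y^2 z^2 + m^2}{2\psi_0}\right) I_0\left(\frac{myz}{\psi_0}\right) \cdot \frac{y}{\psi_0} \exp\left(-\frac{y^2}{2\psi_0}\right) dy \tag{18}$$

The pdf of the decision variable $Z = R_s$ is obtained by
simplifying (18):

$$f_Z(z) = \int_{y=0}^{\infty} \frac{y^3 z}{\psi_0^2} \exp\left(-\frac{y^2 (1 + z^2) + m^2}{2\psi_0}\right) I_0\left(\frac{myz}{\psi_0}\right) dy \tag{19}$$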
B. Double Sliding Window Event Detection

In double sliding window (DSW) detection, the decision is
made over two consecutive sampling windows. According to
the example in Section III, an event or false alarm is reported
when the decision variable $R_s$ exceeds a given threshold $S_D$
(note that $S_D$ is different from the signal strength threshold $V_D$ in
Section IV-A). In the case of two consecutive windows A and
B (note that window B arrives after window A), detecting an
event and false alarm occur respectively under the following two
conditions:

• Detecting an event - Window A represents background
noise and window B represents the occurring event.

• False alarm - Windows A and B both represent background
noise but the decision variable $R_s$ exceeds the threshold.

We then analyze the fundamental performance - the probability
of detection and the probability of false alarm in the
DSW detection scheme.
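
The two cases listed above can also be explored empirically. The following Monte Carlo sketch estimates both probabilities for a given ratio threshold. It adopts the modeling assumption of this section that the window values behave like envelope samples (Rayleigh for noise only, Rice when an event with mean m is present); the function name and the parameter values are hypothetical.

```python
import math
import random


def estimate_dsw_rates(s_d, m, psi0, trials, rng):
    """Return (P_d estimate, P_fa estimate) for the ratio threshold s_d."""
    sd = math.sqrt(psi0)
    detections = 0
    false_alarms = 0
    for _ in range(trials):
        # Window A: noise only (Rayleigh envelope).
        noise_a = math.hypot(rng.gauss(0.0, sd), rng.gauss(0.0, sd))
        if noise_a == 0.0:
            continue  # ratio undefined; practically never happens
        # Window B with an event (Rice envelope) and without one (Rayleigh).
        event_b = math.hypot(m + rng.gauss(0.0, sd), rng.gauss(0.0, sd))
        noise_b = math.hypot(rng.gauss(0.0, sd), rng.gauss(0.0, sd))
        if event_b / noise_a > s_d:
            detections += 1
        if noise_b / noise_a > s_d:
            false_alarms += 1
    return detections / trials, false_alarms / trials


if __name__ == "__main__":
    p_d, p_fa = estimate_dsw_rates(s_d=2.0, m=3.0, psi0=1.0,
                                   trials=20000, rng=random.Random(3))
    print("P_d ~", p_d, "P_fa ~", p_fa)
```

Sweeping s_d in such a simulation traces out the trade-off between detection and false alarm that the analysis below characterizes in closed or integral form.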


The probability of detection in the DSW detection scheme, i.e.,
the probability that the decision variable $Z = R_s$ will exceed
the given threshold $S_D$, is

$$P_d = \int_{S_D}^{\infty} dz \int_{y=0}^{\infty} \frac{y^3 z}{\psi_0^2} \exp\left(-\frac{y^2 (1 + z^2) + m^2}{2\psi_0}\right) I_0\left(\frac{myz}{\psi_0}\right) dy \tag{20}$$

Fig. 4. Case of False Alarm (window B arrives after window A). Window A: noise, Rayleigh distribution; window B: noise, Rayleigh distribution.

2) Probability of False Alarm: When windows A and B
both represent background noise as shown in Fig. 4 but the
decision variable $R_s$ exceeds the threshold $S_D$, a false alarm
is reported. In this case, the pdf of the decision variable $Z = R_s$
is derived similarly, starting from (17).


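Substituting the Rayleigh pdf (10) for both windows into (17) gives

$$f_Z(z) = \int_{y=0}^{\infty} y \cdot \frac{yz}{\psi_0} \exp\left(-\frac{y^2 z^2}{2\psi_0}\right) \cdot \frac{y}{\psi_0} \exp\left(-\frac{y^2}{2\psi_0}\right) dy \tag{21}$$

It follows that:

$$f_Z(z) = \int_{y=0}^{\infty} \frac{y^3 z}{\psi_0^2} \exp\left(-\frac{(1 + z^2) y^2}{2\psi_0}\right) dy \tag{22}$$

Replacing $y^2$ with $s$, (22) becomes:

$$f_Z(z) = \frac{z}{2\psi_0^2} \int_{0}^{\infty} s \exp\left(-\frac{(1 + z^2) s}{2\psi_0}\right) ds = -\frac{z}{\psi_0 (1 + z^2)} \int_{0}^{\infty} s\, d\left[\exp\left(-\frac{(1 + z^2) s}{2\psi_0}\right)\right] \tag{23}$$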


Observe that the probability of false alarm depends only
on the noise variance and threshold level, which is reasonable
since in this case only noise, and no signal, is involved.
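
Under the model of Fig. 4, where both window values are treated as independent Rayleigh variables with the same variance parameter, the ratio has pdf 2z/(1 + z^2)^2 and its tail integral evaluates to 1/(1 + S_D^2). This closed form is worked out here as an illustration and is checked by numerical quadrature; the function names and the finite integration limit are assumptions of the sketch.

```python
import math


def ratio_pdf_noise_only(z):
    """pdf of the ratio of two i.i.d. Rayleigh variables: 2z / (1 + z^2)^2."""
    return 2.0 * z / (1.0 + z * z) ** 2


def dsw_false_alarm_probability(s_d):
    """Tail probability P(Z > s_d) = 1 / (1 + s_d^2)."""
    if s_d < 0.0:
        raise ValueError("threshold must be nonnegative")
    return 1.0 / (1.0 + s_d * s_d)


def dsw_threshold_for_false_alarm(p_fa):
    """Invert the tail formula: s_d = sqrt(1 / p_fa - 1)."""
    if not 0.0 < p_fa <= 1.0:
        raise ValueError("need 0 < p_fa <= 1")
    return math.sqrt(1.0 / p_fa - 1.0)


def tail_by_quadrature(s_d, upper=2000.0, steps=200000):
    """Midpoint-rule integral of the pdf from s_d to a large finite limit."""
    h = (upper - s_d) / steps
    return h * sum(ratio_pdf_noise_only(s_d + (k + 0.5) * h) for k in range(steps))


if __name__ == "__main__":
    s_d = dsw_threshold_for_false_alarm(0.1)
    print(s_d, dsw_false_alarm_probability(s_d), tail_by_quadrature(s_d))
```

In this idealized model the noise variance cancels in the ratio, so a ratio threshold chosen for a target false alarm rate does not need to be retuned when the noise level changes.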

Similarly, we use MAP detection to obtain the optimal threshold
$S_D$ in DSW event detection.

Let $\beta = p(H_1)/p(H_2)$ (the same as in fixed threshold
detection). Applying (19) and (27), we get:

which is equivalent to,
Let

$$U(z) = \int_{0}^{\infty} s\, d\left[\exp\left(-\frac{(1 + z^2) s}{2\psi_0}\right)\right] \tag{24}$$

We next simplify $U(z)$ in (24) using integration by parts for definite
integrals:

$$\int_a^b f(x)\, dg(x) = f(x) g(x) \Big|_a^b - \int_a^b g(x) f'(x)\, dx \tag{25}$$

Comparing (24) and (25), we get $f(s) = s$ and $g(s) =
\exp(-(1 + z^2)s/(2\psi_0))$. $U(z)$ can be solved as follows:


$$U(z) = s \left[\exp\left(-\frac{(1 + z^2) s}{2\psi_0}\right)\right]_{0}^{\infty} - \int_{0}^{\infty} \exp\left(-\frac{(1 + z^2) s}{2\psi_0}\right) ds = -\frac{2\psi_0}{1 + z^2} \tag{26}$$

Substituting $U(z)$ in (23) with (26), we get the pdf of the
decision variable $Z = R_s$ as:

$$f_Z(z) = \frac{2z}{(1 + z^2)^2} \tag{27}$$

The probability of false alarm in the DSW detection scheme,
i.e., the probability that the decision variable
$Z = R_s$ will exceed the given threshold $S_D$, is

$$P_{fa} = \int_{S_D}^{\infty} \frac{2z}{(1 + z^2)^2}\, dz = \int_{S_D}^{\infty} \frac{1}{(1 + z^2)^2}\, dz^2 \tag{28}$$

Let $z^2 = q$; (28) can be written as:

$$P_{fa} = \int_{S_D^2}^{\infty} \frac{dq}{(1 + q)^2} = \frac{1}{1 + S_D^2} \tag{29}$$

Solving (31) gives the optimal threshold $S_D$ in DSW event
detection.


V. Conclusions

Measures of performance for wireless sensor network applications
are defined in various ways, among which detection
probability, false alarm probability, classification errors and
track quality have been widely used. In this paper, we studied
the performance of event detection in WSN. We introduced a
detection scheme - double sliding window (DSW) event detection -
and analyzed its fundamental performance: the probability
of detection and the probability of false alarm.
We believe that in practice our DSW detection will
approach or exceed the performance of fixed threshold detection.
Simulations over the Xbow WSN professional developer’s kit will
be provided in a later version.
Abstract - A wireless sensor network (WSN) is designed to
perform various information processing tasks such as event
detection, target tracking and data classification. Compared with
traditional centralized networks, networked sensing offers unique
advantages in robustness and scalability. Measures of
performance for these tasks are well defined, including detection
of false alarms or misses, classification errors, and track quality.
In this paper, we present a new algorithm for event detection in
wireless sensor networks. Our performance analysis is based on
the new detection scheme - double sliding window (DSW) event
detection. We compare it theoretically against the fixed threshold
approach in terms of probability of detection and false alarm.

I.  Introduction 

Research  on  sensor  networks  was  originally  motivated 
by military applications. Starting around 1980, networked
microsensor technology has been widely used in military
applications. One example of such applications is the Cooperative
Engagement Capability (CEC) developed by the
U.S. Navy. This network-centric warfare system consists of multiple
radars collecting data on air targets [1]. Other military sensor
networks  include  acoustic  sensor  arrays  for  antisubmarine 
warfare  such  as  the  Fixed  Distributed  System  (FDS)  and  the 
Advanced  Deployable  System  (ADS),  and  unattended  ground 
sensors  (UGS)  such  as  the  Remote  Battlefield  Sensor  System 
(REMBASS)  and  the  Tactical  Remote  Sensor  System  (TRSS). 

Nowadays, small and inexpensive sensors based upon
microelectromechanical system (MEMS) [2] technology, wireless
networking, and inexpensive low-power processors allow the
deployment of wireless sensor networks for various non-military
applications, from environment and habitat monitoring,
to industrial process control, to infrastructure security [3]
and transportation automation.

A wireless sensor network (WSN) consists of a number of
small, energy-constrained nodes. The basic components of a
sensor node include one or more sensor modules, a wireless
transmitter-receiver module, a computational module and a
power supply module. Such networks are normally deployed for
data collection where human intervention after deployment, to
recharge or replace node batteries, may not be feasible.
Therefore, the energy constraint is a distinguishing
characteristic of WSNs compared to traditional wireless ad-hoc
networks. According to [4], energy consumption occurs in three
domains: sensing, data processing (including AD/DA and digital
signal processing), and communications. The authors of [5]
found that the sensor and signal processing parts operate at low
frequency and consume less than 1 mW. This is over an order
of magnitude less than the energy consumption of the
communication part. Therefore, we prefer less communication
and data exchange between sensor nodes and more local
processing within a single sensor node, so as to increase the
lifetime of the WSN.

The main goal of wireless sensor networks is to monitor the
physical world. Most of the time, no event happens in the
sensed field or surveillance zone, so the sensed data need not
be stored for a long time or transmitted to the gateway.
Usually, people are more interested in unexpected events. For
example, in a battlefield scenario, people are more interested
in the appearance of enemies. If a wireless sensor network
monitors forest fires, an unusual increase of the temperature
should trigger a warning. Both the appearance of enemies and
the unusual increase of the temperature can be seen as events.
Because of the energy, storage, and memory constraints of
wireless sensor networks, the ideal state of a wireless sensor
network is event-driven, so that the RF communication circuits
can be powered off most of the time. Only when certain sensor
nodes detect an event do they trigger the RF channel and
transmit the useful information to the gateway or headquarters.
Therefore, event detection is one of the key issues for wireless
sensor networks, and it is a very efficient way of self-managing,
which helps to relieve the memory, storage and energy
constraints.

Performance  of  wireless  sensor  network  applications  is 
measured  in  several  ways  including  detection  of  false  alarms  or 
misses,  classification  errors,  and  track  quality.  In  this  paper,  we 
present  a  fundamental  performance  analysis  of  event  detection 
in  wireless  sensor  networks.  We  introduce  a  new  scheme  of 
event  detection  for  WSN  -  double  sliding  window  (DSW) 
event  detection  and  analyze  the  fundamental  performance:  the 
probability  of  detection  and  the  probability  of  false  alarm  over 
this  new  detection  scheme. 

The rest of this paper is organized as follows. Section II
introduces a common type of sensor model for tracking: the
acoustic amplitude sensor model. Double sliding window event
detection is described in Section III. In Section IV we detail the
fundamental performance analysis of the proposed detection
scheme. Section V concludes this paper.

II.  Acoustic  Amplitude  Sensor  Model 

Localizing and tracking moving objects is an essential
capability for a sensor network in many practical applications.
Another class of sensor network applications concerns
the problem of sensing/detecting a field. Although they
may seem quite different from each other, both require
collaborative processing among sensor nodes along the temporal
dimension as well as in the spatial domain [6]. In the field
sensing case, the collaboration among sensors primarily occurs
in the spatial domain and occasionally along the temporal
dimension when the field evolves over time. In our study, we
focus on the field sensing/detecting problem.

A.  Notation  and  Assumptions 

We  use  the  following  notation  in  our  formulation  of  the 
sensing/detecting  problem  in  a  sensor  network: 

• Superscript $t$ denotes time. We consider discrete times $t$
that are nonnegative integers.

• Subscript $i \in [1, \ldots, K]$ denotes the sensor index; $K$ is
the total number of sensors in the network.

• Subscript $j \in [1, \ldots, N]$ denotes the target index; $N$ is the
total number of targets being observed.

• The target state at time $t$ is denoted as $x^{(t)}$. For a multi-target
sensing/detecting problem, this is a concatenation
of individual target states $x_j^{(t)}$.

• The measurement of sensor $i$ at time $t$ is denoted as $z_i^{(t)}$.

• The measurement history up to time $t$ is denoted as
$\bar{z}^{(t)} = \{z^{(0)}, z^{(1)}, \ldots, z^{(t)}\}$. The measurements may
originate from a single sensor or a set of sensors.

• The collection of all sensor measurements at time $t$ is
denoted as $z^{(t)} = \{z_1^{(t)}, z_2^{(t)}, \ldots, z_K^{(t)}\}$.

In this paper, we consider a single sound source as the
target ($N = 1$) and the target state $x^{(t)}$ is the location of the
target in a two-dimensional plane. Each sensor measures the
received signal strength reflected from the target. We make the
assumption that the sensor characteristics are time-invariant
and the target is located at a fixed position.

B.  Sensing  Model 

The time-dependent measurement $z_i^{(t)}$ of sensor $i$ with
characteristics $\lambda_i^{(t)}$ is related to the target state $x^{(t)}$ through
the following observation model,

$$z_i^{(t)} = h(x^{(t)}, \lambda_i^{(t)}) \tag{1}$$

where $h$ is a function depending on $x^{(t)}$ and parameterized
by $\lambda_i^{(t)}$, which represents our knowledge about sensor $i$.
In our study, we consider the sensing model for a single
target with $x$ representing the location of the target. Typical
characteristics $\lambda_i^{(t)}$ of sensor $i$ include the sensing modality
(e.g. what kind of sensor $i$ is), the sensor position and other
parameters, such as the noise model of sensor $i$. Normally, the
sensor characteristics are relatively stable compared with the
more dynamic measurements.

Eq. (1) is a general form of the observation model that
accounts for possibly nonlinear relations between the sensor
type, sensor position, noise model, etc. A special case of (1)
would be

$$h(x^{(t)}, \lambda_i^{(t)}) = f_i(x^{(t)}, \lambda_i^{(t)}) + w_i^{(t)} \tag{2}$$

where $f_i$ is an observation function, and $w_i^{(t)}$ is additive, zero
mean noise with known covariance.

In order to illustrate the idea, we consider the problem
of stationary target localization with time-invariant sensor
characteristics. In this paper, we assume that all sensors are
acoustic sensors measuring only the amplitude of the received
sound signal so that the state parameter $x$ is the unknown
target position. Note that under our assumption, there is no
longer a time dependence for $x$ and $\lambda_i$. Assuming that acoustic
signals propagate isotropically, the parameters are related to
the measurements by

$$z_i = \frac{a_i}{\|x - \zeta_i\|^{\alpha/2}} + w_i \tag{3}$$

where $a_i$ is a given random variable representing the amplitude
of the signal at the target, $\zeta_i$ is the position of sensor $i$,
$\alpha$ is a known attenuation coefficient, and $\|\cdot\|$ is the
Euclidean norm. The term $w_i$ is a zero mean Gaussian random
variable with variance $\sigma_i^2$.

C.  Acoustic  Amplitude  Sensor 

There  are  two  common  types  of  sensors  for  detecting  and 
tracking:  acoustic  amplitude  sensors  and  direction-of-arrival 
(DOA)  sensors.  In  this  section,  we  detail  the  characteristics  of 
the  acoustic  amplitude  sensors. 

An  acoustic  amplitude  sensor  node  measures  sound  ampli¬ 
tude  at  the  microphone  and  estimates  the  distance  to  the  target 
based  on  the  physics  of  sound  attenuation.  Generally,  range 
sensors  estimate  distance  based  on  received  signal  strength  or 
time  difference  of  arrival  (TDOA). 

Assuming that the sound source is a point source and sound propagation is lossless and isotropic, a root-mean-squared (RMS) amplitude measurement $z$ is related to the sound source position $x$ as

$$z = \frac{a}{\|x - \zeta\|} + w \qquad (4)$$

where $a$ is the RMS amplitude of the sound source, $\zeta$ is the location of the sensor, and $w$ is RMS measurement noise [7]. This is a special case of (3). The noise $w$ is Gaussian with zero mean and variance $\sigma^2$.
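As a rough numerical illustration of the amplitude model, the following Python sketch draws one reading under the assumption that the measurement has the form z = a / ||x - zeta||^(alpha/2) + w, with alpha = 2 giving the lossless case described here. The function name, argument names and example numbers are illustrative and do not come from the paper.

```python
import math
import random


def amplitude_measurement(x, zeta, a, sigma, alpha=2.0, rng=random):
    """Return one noisy amplitude reading z = a / ||x - zeta||**(alpha/2) + w.

    x, zeta : target and sensor positions (sequences of equal length)
    a       : source amplitude
    sigma   : standard deviation of the zero-mean Gaussian noise w
    alpha   : attenuation coefficient (alpha = 2 is the lossless case)
    rng     : object providing gauss(mu, sigma); hypothetical hook for seeding
    """
    distance = math.dist(x, zeta)            # Euclidean norm ||x - zeta||
    if distance == 0.0:
        raise ValueError("target and sensor positions coincide")
    w = rng.gauss(0.0, sigma)                # additive measurement noise
    return a / distance ** (alpha / 2.0) + w


# Noise-free check: a source of amplitude 10 at distance 5 reads 10 / 5.
z_clean = amplitude_measurement((3.0, 4.0), (0.0, 0.0), a=10.0, sigma=0.0)
```

Passing a seeded `random.Random` instance as `rng` makes noisy draws reproducible, which is convenient when comparing detection schemes on the same data.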

III.  Double  Slide  Window  Event  Detection 

The ability of a sensor receiver to detect a weak echo signal is limited by the noise that occupies the same part of the frequency spectrum as the signal. Detection of an acoustic signal is based on establishing a threshold at the output of the receiver. If the receiver output exceeds the threshold, a target is said to be present. This is called threshold detection. Fig. 1 represents the output of an acoustic receiver as a function of time. The fluctuating appearance of the output is due to the random nature of receiver noise.

A threshold level in Fig. 1 is shown by the long dashed line. If the signal is large enough, as at A and B, a target is reported to be present, but C is a missed detection at the given threshold. The signal at C would have been detected if the threshold were lower. However, too low a threshold increases the likelihood that noise alone will exceed the threshold and cause a false alarm.


Fig.  1.  Envelope  of  the  radar  receiver  output  as  a  function  of  time.  A,  B 
and  C  represent  signal  plus  noise.  A  and  B  would  be  valid  detections,  but 
C  is  a  missed  detection 


In [8], the received signal strength from acoustic sensors is integrated over a fixed period of time, and when the result $S$ exceeds a threshold, the authors declare that an event has been detected:

$$S = \sum_{l=0}^{M-1} |z_{n-l}|^2 \qquad (5)$$

$$S \ge S_{\mathrm{threshold}} \qquad (6)$$

where $z$ denotes the measurement of received signal strength at each sampling point and $M$ is the length of the observing/sampling window.

However, this simple method suffers from a significant drawback: the appropriate value of the threshold depends on the sensed signal energy. When no event is occurring in the sensing range, the sensed signal consists only of noise. The level of the noise power is generally unknown and can change when the environment changes or when unwanted interferers switch on and off. Therefore, it is quite difficult to set a fixed threshold. We design a double sliding window algorithm for event detection to alleviate the threshold selection problem.

The double sliding window event-detection algorithm calculates the sensed signal energy in two consecutive sliding windows. The basic principle is to form the decision variable as the ratio of the total energy contained inside the two windows. Fig. 2 shows two consecutive windows A and B (note that window B arrives after window A) and the response of the ratio $R_s$ to a sensed event. It can be seen that when only noise is sensed the response is nearly flat, since both windows ideally contain the same amount of noise energy.

Fig. 2. Illustration of double sliding window event detection. A and B are two consecutive sampling windows of the same length, and B arrives after A.

The values of window A and window B are calculated as

$$S_a = \sum_{l=0}^{M-1} |z_{n-l}|^2, \qquad (7)$$

$$S_b = \sum_{l=0}^{M-1} |z_{n+M-l}|^2. \qquad (8)$$

Then the decision variable $R_s$ is

$$R_s = \frac{S_b}{S_a}. \qquad (9)$$

The advantage of this approach is that the decision variable $R_s$ does not depend on the absolute sensed signal energy, but on the ratio of the energies of two consecutive windows.
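A minimal Python sketch of the ratio test follows. It assumes window A covers samples z[n-M+1..n] and window B covers z[n+1..n+M], which is one reading of the two window sums; the function names and the toy signal are illustrative only.

```python
def window_energy(samples):
    """Sum of squared magnitudes over one sampling window."""
    return sum(abs(z) ** 2 for z in samples)


def dsw_ratio(z, n, M):
    """Decision variable R_s = S_b / S_a at sample index n.

    Window A is z[n-M+1 .. n]; window B is z[n+1 .. n+M] and arrives after A.
    """
    s_a = window_energy(z[n - M + 1 : n + 1])
    s_b = window_energy(z[n + 1 : n + M + 1])
    # An all-zero window A is treated as an infinite ratio (an assumption).
    return s_b / s_a if s_a > 0 else float("inf")


def detect_events(z, M, threshold):
    """Indices n at which the energy ratio exceeds the threshold."""
    return [n for n in range(M - 1, len(z) - M) if dsw_ratio(z, n, M) > threshold]


# Toy signal: amplitude 1 for eight samples, then amplitude 3.
signal = [1.0] * 8 + [3.0] * 8
hits = detect_events(signal, M=4, threshold=5.0)

# Scaling every sample by the same factor leaves the ratio unchanged,
# which is why the threshold need not track the absolute signal level.
scaled = [10.0 * v for v in signal]
```

On the toy signal the ratio peaks when window A lies entirely before the amplitude change and window B entirely after it.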

IV.  Fundamental  Performance  Analysis 
A.  Fixed  Threshold  Event  Detection 

In a wireless sensor network consisting of acoustic sensors, the received signal at the sensor nodes can be described by a Gaussian probability density function (pdf).

If there is no event (only noise exists) in the observed sampling window, the received noise at the sensor follows a Gaussian distribution with zero mean and variance $\psi_0$:

$$p(R) = \frac{1}{\sqrt{2\pi\psi_0}} \exp\left(-\frac{R^2}{2\psi_0}\right) \qquad (10)$$


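The probability of false alarm, which is the probability that the envelope of the noise will exceed the fixed threshold $V_D$, can be determined using the Q-function:

$$P_{fa} = \int_{V_D}^{\infty} \frac{1}{\sqrt{2\pi\psi_0}} \exp\left(-\frac{R^2}{2\psi_0}\right) dR = Q\left(\frac{V_D}{\sqrt{\psi_0}}\right) \qquad (11)$$

where the Q-function is defined as

$$Q(z) = \frac{1}{\sqrt{2\pi}} \int_{z}^{\infty} \exp\left(-\frac{x^2}{2}\right) dx = \frac{1}{2}\,\mathrm{erfc}\left(\frac{z}{\sqrt{2}}\right) \qquad (12)$$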


If an event is occurring, the pdf of the sensed signal at the sensor is Gaussian with mean $m$ and variance $\psi_0$:

$$p(R) = \frac{1}{\sqrt{2\pi\psi_0}} \exp\left(-\frac{(R-m)^2}{2\psi_0}\right) \qquad (13)$$


The probability of detection $P_d$, which is the probability that the envelope $R$ will exceed the threshold $V_D$, can also be determined using the Q-function. Integrating (13) from $V_D$ to infinity gives

$$P_d = Q\left(\frac{V_D - m}{\sqrt{\psi_0}}\right) \qquad (14)$$

We are interested in the optimal value of the threshold $V_D$. To obtain it, we use maximum a posteriori (MAP) detection. The decision boundary is

$$\frac{f(R|H_2)}{f(R|H_1)} = \frac{p(H_1)}{p(H_2)} \qquad (15)$$

where $H_1$ denotes the case of no events while $H_2$ denotes the case with events. $f(R|H_1)$ and $f(R|H_2)$ therefore represent the pdfs of the two cases, respectively. In one observation, the probability of no events equals $p(H_1)$ and the probability of events equals $p(H_2)$.

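Let $\beta = p(H_1)/p(H_2)$. Applying (10) and (13), we get

$$\frac{1}{\sqrt{2\pi\psi_0}} \exp\left(-\frac{(R-m)^2}{2\psi_0}\right) = \beta\, \frac{1}{\sqrt{2\pi\psi_0}} \exp\left(-\frac{R^2}{2\psi_0}\right) \qquad (16)$$

The optimal threshold $V'_D$ is the solution of (16). Taking logarithms of both sides gives $2mV'_D - m^2 = 2\psi_0 \ln\beta$, so $V'_D$ can be written as:

$$V'_D = \frac{m}{2} + \frac{\psi_0}{m} \ln\beta \qquad (17)$$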
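The fixed-threshold quantities can be evaluated numerically. The Python sketch below assumes that both hypotheses share the variance psi0, so that the false alarm probability is Q(V_D / sqrt(psi0)), the detection probability is Q((V_D - m) / sqrt(psi0)), and equating the two Gaussian pdfs weighted by beta gives the threshold m/2 + (psi0/m) ln(beta). The function names are illustrative, not taken from the paper.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def p_false_alarm(v_d, psi0):
    """P(noise ~ N(0, psi0) exceeds the threshold v_d)."""
    return q_function(v_d / math.sqrt(psi0))


def p_detection(v_d, m, psi0):
    """P(signal ~ N(m, psi0) exceeds the threshold v_d)."""
    return q_function((v_d - m) / math.sqrt(psi0))


def gaussian_pdf(r, mean, var):
    """Gaussian density, used only to check the MAP boundary."""
    return math.exp(-(r - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def map_threshold(m, psi0, beta):
    """Threshold at which pdf(r | event) equals beta * pdf(r | no event)."""
    return m / 2.0 + psi0 * math.log(beta) / m


# With equal priors (beta = 1) the threshold sits halfway between 0 and m.
v_map = map_threshold(m=2.0, psi0=1.0, beta=1.0)
p_fa = p_false_alarm(v_map, psi0=1.0)
p_d = p_detection(v_map, m=2.0, psi0=1.0)
```

Raising beta (events become rarer a priori) moves the threshold up, trading detection probability for fewer false alarms.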


B. Double Sliding Window Event Detection

In double sliding window (DSW) detection, the decision is made over two consecutive sampling windows. According to the example in Section III, an event or false alarm is reported when the decision variable $R_s$ exceeds a given threshold $S_D$ (note that $S_D$ is different from the signal strength threshold $V_D$ in Section IV-A).

In the case of two consecutive windows A and B (note that window B arrives after window A), detecting an event and a false alarm occur respectively under the following two conditions:

• Detecting an event - Window A represents background noise and window B represents the occurring event, as shown in Fig. 3.

• False alarm - Windows A and B both represent background noise but the decision variable $R_s$ exceeds the threshold, as shown in Fig. 4.

We then analyze the fundamental performance - the probability of detection and the probability of false alarm - in the DSW detection scheme.

Fig. 3. Case of Detecting an Event: window A contains noise and window B contains the event (window B arrives after window A)

1) Probability of Detection: In Fig. 3, the observation in window A is Gaussian noise with zero mean and variance $\psi_0$, and window B represents the event pdf, which is also Gaussian but with mean $m$ and variance $\psi_0$.

Let $X = S_b$, $Y = S_a$ and $Z = R_s = S_b/S_a$ (referring to Section III). We derive the pdf of the decision variable $Z = X/Y$ in the following.

Since the random variables $X$ and $Y$ are independent, we have

$$f_Z(z) = \int_{y=0}^{\infty} y\, f_X(x = yz)\, f_Y(y)\, dy \qquad (18)$$

Using the pdfs of observing windows A and B, we get

$$f_Z(z) = \int_{y=0}^{\infty} y\, \frac{1}{\sqrt{2\pi\psi_0}} \exp\left(-\frac{(yz-m)^2}{2\psi_0}\right) \frac{1}{\sqrt{2\pi\psi_0}} \exp\left(-\frac{y^2}{2\psi_0}\right) dy \qquad (19)$$

The pdf of the decision variable $Z = R_s$ can be obtained by simplifying (19):

$$f_Z(z) = \int_{y=0}^{\infty} \frac{y}{2\pi\psi_0} \exp\left(-\frac{(yz-m)^2 + y^2}{2\psi_0}\right) dy \qquad (20)$$

The probability of detection in the DSW detection scheme, i.e., the probability that the decision variable $Z = R_s$ will exceed the given threshold $S_D$, is given below:

$$P_d = \int_{S_D}^{\infty} \int_{y=0}^{\infty} \frac{y}{2\pi\psi_0} \exp\left(-\frac{(yz-m)^2 + y^2}{2\psi_0}\right) dy\, dz \qquad (21)$$

Fig. 4. Case of False Alarm: windows A and B both contain noise (window B arrives after window A)

2) Probability of False Alarm: When windows A and B both represent background noise, as shown in Fig. 4, but the decision variable $R_s$ exceeds the threshold $S_D$, a false alarm is reported. In this case, the pdf of the decision variable $Z = R_s$ is derived similarly, starting from (18):

$$f_Z(z) = \int_{y=0}^{\infty} \frac{y}{2\pi\psi_0} \exp\left(-\frac{(yz)^2}{2\psi_0}\right) \exp\left(-\frac{y^2}{2\psi_0}\right) dy \qquad (22)$$

It follows that:

$$f_Z(z) = \frac{1}{2\pi\psi_0} \int_{0}^{\infty} y \exp\left(-\frac{1+z^2}{2\psi_0}\, y^2\right) dy \qquad (23)$$

Replacing $y^2$ with $s$, (23) becomes:

$$f_Z(z) = \frac{1}{4\pi\psi_0} \int_{0}^{\infty} \exp\left(-\frac{1+z^2}{2\psi_0}\, s\right) ds \qquad (24)$$

$f_Z(z)$ can be solved as:

$$f_Z(z) = \frac{1}{2\pi(1+z^2)} \qquad (25)$$

The probability of false alarm in the DSW detection scheme, i.e., the probability that the decision variable $Z = R_s$ will exceed the given threshold $S_D$, is

$$P_{fa} = \int_{S_D}^{\infty} \frac{1}{2\pi(1+z^2)}\, dz \qquad (26)$$

Using the indefinite integration formula

$$\int \frac{dx}{a^2 + x^2} = \frac{1}{a} \arctan\frac{x}{a} \qquad (27)$$

the probability of false alarm can be solved as:

$$P_{fa} = \frac{1}{2\pi}\left(\frac{\pi}{2} - \arctan S_D\right) \qquad (28)$$

Observe that in the DSW event detection scheme, the probability of false alarm does not depend on the noise variance but only on the decision variable threshold $S_D$.

Similarly, we use MAP detection to get the optimal threshold $S_D$ in DSW event detection.

Assume $\beta = p(H_1)/p(H_2)$ (the same as in fixed threshold detection). Applying (20) and (25) to the decision boundary (15), we get:

$$\int_{y=0}^{\infty} \frac{y}{2\pi\psi_0} \exp\left(-\frac{(yS_D-m)^2 + y^2}{2\psi_0}\right) dy = \frac{\beta}{2\pi(1+S_D^2)} \qquad (29)$$

Solving (29) gives the optimal threshold $S_D$ in DSW event detection.

V. Conclusions

Measures of performance for wireless sensor network applications are defined in various ways, among which detection probability and false alarm probability, classification errors and track quality have been widely used.

In this paper, we studied the performance of event detection in wireless sensor networks. We introduced a new detection algorithm - double sliding window (DSW) event detection - where the detection decision is made over two consecutive sampling windows. We analyzed the fundamental performance - the probability of detection and the probability of false alarm - of this new detection scheme and compared it theoretically against the fixed threshold algorithm. We believe that our DSW detection will practically approach or exceed the fixed threshold detection. Simulations on the Xbow wireless sensor network professional developer's kit will be provided in a later version.

Abstract - In this paper, a novel asynchronous energy-efficient
MAC protocol, ASCEMAC, is proposed for wireless sensor networks. We combine the energy saving strategies of both contention-based and schedule-based MAC protocols in our algorithm. In ASCEMAC, by applying a free-running method and a fuzzy logic rescheduling scheme, the time synchronization that existing energy-efficient MAC protocols require is no longer needed. Moreover, we present a traffic intensity and network density-based model to determine essential algorithm parameters, such as power on/off duration, interval of schedule broadcast, and super-time-slot size and order. Simulation results show that our algorithm ensures the average successful transmission rate, decreases the average waiting time of data packets, and reduces the average energy consumption. Therefore, network performance is improved and network lifetime is extended by using our algorithm.

I.  Introduction 

For wireless sensor networks (WSNs), energy saving is becoming more and more important because of the nodes' limited energy resources. Several solutions for saving energy at the MAC layer of WSNs have been put forward. They can be classified into two main categories according to their channel access strategies: contention-based MAC protocols and schedule-based MAC protocols.

Among schedule-based energy-efficient MAC protocols, a new standard named IEEE 802.15.4 [5] has been developed. It concentrates on providing a physical-layer and MAC-layer standard with ultra-low complexity, cost, and power for low-data-rate wireless connectivity among cheap fixed devices. Traffic-Adaptive Medium Access (TRAMA) [2] employs a traffic-adaptive, distributed election scheme to allocate the system time among sensor nodes. In EMACS [12], only active nodes monitor new communication requests from passive nodes. By assigning transmission times to different sensor nodes, these schedule-based MAC protocols reduce the energy consumed by collisions and idle listening. However, allocating time slots efficiently and fairly is one of their biggest challenges.

Among contention-based energy-efficient MAC protocols, S-MAC [3] divides the system time into frames. During the sleeping part of a frame, a node powers off its radio to save energy, and it performs communications during the active part. T-MAC, proposed in [4], enables each node to dynamically and locally adjust the communication duration based on its own traffic. These contention-based MAC protocols save energy by confining all nodes' communications to a certain period of time.

In all the previously mentioned energy-efficient MAC protocols, even though the mechanisms for managing the power on/off periods differ, an accurate time synchronization method [7] is the common premise for saving energy and communicating successfully among nodes.

As we know, the quality of each sensor node's clock usually boils down to its frequency stability and frequency accuracy [7]. In general, as frequency stability and accuracy increase, so do the clock's power requirements, size and cost, all of which are troublesome for general sensor nodes. Hence, clock drifts, introduced by unstable and inaccurate frequency standards, are unavoidable in most WSNs. Without a correct global clock established by time synchronization, the previously mentioned energy-efficient MAC protocols must therefore suffer some unsuccessful communications caused by nodes switching between power on/off states at non-coincident times (we call these mismatch operations). In the algorithm description part, we will discuss further how clock drift results in unsuccessful communications.

Moreover, for all the previously mentioned energy-efficient MAC protocols, how to determine the durations of the power on/off phases is seldom discussed, although these two durations are closely related to system performance, such as energy efficiency and throughput. In WSNs, the traffic in general has a heterogeneous nature [6], i.e., the traffic arrival rate for different nodes, or even for the same node at different times, fluctuates considerably during the network lifetime.

In this paper, we present an asynchronous energy-efficient MAC protocol, ASCEMAC, which not only outperforms the existing energy-efficient MAC protocols, but also removes the tight dependency on time synchronization. We combine the energy saving strategies of both contention-based and schedule-based MAC protocols in our algorithm. ASCEMAC applies a free-running method and a fuzzy logic [10] rescheduling scheme to set up phase-switching schedules and compensate for clock drifts among nodes. Moreover, we present a traffic intensity and network density-based model to determine the essential algorithm parameters, such as power on/off duration, interval of schedule broadcast, and super-time-slot length and order.

The remainder of this paper is organized as follows: our ASCEMAC design is described in Section II; simulation results are given in Section III; Section IV concludes this paper.

II.  ASCEMAC  Protocol  Description  and  Discussion 

We use the Energy-Efficient Self-Organization (ESO) [9] algorithm to form clusters. Each cluster has only one cluster head. The radius of a cluster is the communication range of the cluster head. Nodes in one cluster can talk to their neighbors directly. The wireless medium (or common channel) access scheme within a cluster is specified by ASCEMAC.

ASCEMAC divides system time into four phases: TRFR-Phase, Schedule-Broadcast-Phase, On-Phase and Off-Phase. An on/off rotation consists of an adjacent On-Phase and Off-Phase. The period between two adjacent Schedule-Broadcast-Phases is a fixed-schedule stage, which consists of several on/off rotations. Fig. 1 presents the system time scheme structure.

Fig. 1. System Time Scheme Structure

The function of each phase is:

• TRFR-Phase is reserved for normal nodes to send Traffic-Rate & Failure-Rate (TRFR) messages to their cluster head;

• Schedule-Broadcast-Phase is reserved for the cluster head to locally broadcast phase-switching schedules within its control range;

• Off-Phase is reserved for all nodes to power off their radios. In this phase, there is no communication, but data storing and sensing may happen;

• On-Phase is reserved for all nodes to power on their radios to communicate. In this phase, the system time is further divided into super-time-slots, which are composed of several normal time-slots. Each super-time-slot is used continuously by one source-destination pair. One normal time-slot is the period of time ($T_d$) needed to complete one data transmission from source to destination.

In ASCEMAC, each node informs the cluster head of its traffic intensity, transmission failures and buffer overflow through a TRFR message (see Section A). Based on that information, the cluster head determines the power on/off duration (see Section B), the interval of schedule broadcast (see Section C), as well as the length and the order of the super-time-slots (see Section D). After receiving the schedule broadcast message from the cluster head, each node sets up its own phase-switching schedule. From then on, each node powers on its radio to communicate and powers off its radio to save energy according to its phase-switching schedule. We describe ASCEMAC in detail in the following sections.

A.  TRFR  Message  Design 

A TRFR message (see Fig. 2) is sent by a normal node during the TRFR-Phase.

Type | Source | Data Arrival Rate | Failure Rate | Overflowing Rate

Fig. 2. TRFR Message Format

•  “Data  Arrival  Rate”  is  the  number  of  data  packets  coming  from 
node’s  sensing  component  per  second; 

• “Failure Rate” is the ratio of unsuccessfully transmitted data packets, caused by mismatch operations, to total transmitted data packets;

• “Overflowing Rate” is the ratio of overflowing data packets, caused by improper power off duration, to total data packets coming from the node’s sensing component.

In our algorithm, we add an ACK message to acknowledge successful reception. A transmission is defined as unsuccessful when the transmitter does not receive an ACK within a certain period of time.

During each on/off rotation, each node independently estimates its data traffic arrival rate, unsuccessful transmission rate and overflowing rate. The rates sent to its cluster head, however, are the average values over all on/off rotations.


During the TRFR-Phase, each node randomly chooses a time to send its TRFR message; the transmission time follows a uniform distribution. The working process is similar to CSMA [1]. The randomness of the transmission time, together with carrier sensing, reduces the collision probability and increases the probability that a TRFR message is transmitted successfully. In the simulation section (Section III), the experiment on TRFR messages will show that normal nodes within a cluster have a very high probability of sending TRFR messages to their cluster head successfully.


B.  Power  on/off  Duration  (Tn/Tf)  Design 

In designing $T_f$, we consider the following factors:

•  During  Off-Phase  and  other  nodes’  transmission  time,  a  node 
stops  communicating,  but  there  are  still  data  packets  arriving 
from  sensing  component; 

•  For  each  node,  the  buffer  space  is  limited.  When  buffer  is  used 
up  (or  overflowed),  the  following  incoming  data  packets  must 
be  discarded; 

•  There  is  a  lifetime  for  each  data  packet.  So  the  traffic  over  the 
network  is  sensitive  to  waiting  time. 

Based on the traffic arrival rate, buffer space and traffic lifetime, we design the Off-Phase duration ($T_f$) to avoid buffer overflow and keep information as up to date as possible. If we know the maximum waiting time $W_{max}$, the buffer size $k_i$ and the traffic arrival rate $\lambda_i$ for node $i$, $T_f$ can be calculated using

$$T_f = \min\left\{ (2W_{max} - T_n),\ \min_i\left(\frac{k_i}{\lambda_i} - T_n\right) \right\} \qquad (1)$$

In general WSNs, each node has similar capability. Therefore, we can let $k_i = K$ ($i = 1, 2, \ldots$). Then (1) becomes

$$T_f = \min\left\{ (2W_{max} - T_n),\ \min_i\left(\frac{K}{\lambda_i} - T_n\right) \right\} \qquad (2)$$

It is obvious that the longer the Off-Phase is, the more energy is saved. However, the average waiting time of data packets increases as the duration of the Off-Phase increases. So there is a trade-off between saving energy and reducing waiting time.

During the On-Phase, nodes start to send/receive data packets. In this phase, system time is divided into slots. A certain number of time slots are continuously occupied by a source-destination pair. There is no competition or carrier sensing during the On-Phase. Knowing the average traffic arrival rate $\lambda_i$ for node $i$, the Off-Phase duration ($T_f$) and the total number $N$ of nodes in this cluster, $T_n$ can be calculated as

$$T_n = \frac{T_f T_d \sum_{i=1}^{N} \lambda_i}{1 - T_d \sum_{i=1}^{N} \lambda_i} \qquad (3)$$

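Combining (2) and (3), we obtain the final equations for $T_f$ and $T_n$. There are two cases:

1) when $2W_{max} < \min_i(K/\lambda_i)$

$$T_f = 2W_{max}(1 - T_d\phi) \qquad (4)$$

$$T_n = 2W_{max} T_d \phi \qquad (5)$$

2) when $2W_{max} \ge \min_i(K/\lambda_i)$

$$T_f = \min_i\left(\frac{K}{\lambda_i}\right)(1 - T_d\phi) \qquad (6)$$

$$T_n = \phi T_d \min_i\left(\frac{K}{\lambda_i}\right) \qquad (7)$$

where $\phi$ is the sum of the $N$ nodes' traffic arrival rates, defined as $\phi = \sum_{i=1}^{N} \lambda_i$.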

Notice that, in designing the On-Phase and Off-Phase durations, we try to extend the power-off time to save more energy while keeping the data packets' waiting time at an acceptable value.
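The two cases can be folded into a single computation, since in both of them the on/off rotation length T_f + T_n equals the smaller of 2 W_max and min_i(K / lambda_i). The Python sketch below assumes that reading of the duration equations; the function name, argument names and example values are illustrative only.

```python
def on_off_durations(w_max, buffer_size, arrival_rates, t_d):
    """Return (T_f, T_n) for one cluster.

    w_max         : maximum tolerable waiting time W_max (seconds)
    buffer_size   : common buffer size K (packets)
    arrival_rates : per-node arrival rates lambda_i (packets/second, all > 0)
    t_d           : duration of one normal time-slot T_d (seconds)
    """
    phi = sum(arrival_rates)            # total arrival rate of the cluster
    load = t_d * phi                    # fraction of time needed for sending
    if not 0.0 < load < 1.0:
        raise ValueError("need 0 < T_d * phi < 1 for a feasible schedule")
    # Rotation length is limited either by waiting time or by buffer size.
    rotation = min(2.0 * w_max, min(buffer_size / lam for lam in arrival_rates))
    t_n = rotation * load               # On-Phase duration
    t_f = rotation * (1.0 - load)       # Off-Phase duration
    return t_f, t_n


# Case 1 (waiting time is the binding limit): 2 * W_max = 10 < K / max(lambda) = 20.
t_f1, t_n1 = on_off_durations(5.0, 100.0, [2.0, 3.0, 5.0], 0.01)

# Case 2 (buffer size is the binding limit): 2 * W_max = 100 > 20.
t_f2, t_n2 = on_off_durations(50.0, 100.0, [2.0, 3.0, 5.0], 0.01)
```

In both cases the result satisfies the On-Phase relation T_n = T_f T_d phi / (1 - T_d phi), so the On-Phase is just long enough to carry the traffic that arrives during one full rotation.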


C.  Phase-Switching  Schedule  Establishment  and  Interval  of  Schedule 
Broadcast  Design 

Free-running is a timing method which allows each node to run on its own clock. ASCEMAC uses the free-running method to save energy and spectrum resources, because the free-running method does not maintain a global clock within a cluster. Furthermore, we design a schedule broadcast message (see Fig. 3). The cluster head generates this message and broadcasts it within its cluster.

Type  | SRC    | Off-Duration     | On-Duration
SRC_1 | DEST_1 | Defer-Duration_1 | Slot-Duration_1
SRC_2 | DEST_2 | Defer-Duration_2 | Slot-Duration_2
...
SRC_i | DEST_i | Defer-Duration_i | Slot-Duration_i

Fig. 3. Schedule Broadcast Packet Format

The function of each field of the schedule broadcast message is:

• “On-Duration” specifies when all nodes should switch to Off-Phase;

• “Off-Duration” regulates how long all nodes should stay at one On/Off rotation;

• “Slot-Duration_i” regulates the length of the ith super-time-slot;

• “Defer-Duration_i” informs nodes how long after the start of an On-Phase the ith super-time-slot begins;

• “SRC_i” and “DEST_i” specify the source and destination of the ith super-time-slot.

If  clock  drifts  do  not  exist,  coincident  phase-switching  schedule  is 
supposed  to  be  set  up  at  each  node,  based  on  each  node’s  own  local 
clock  and  this  schedule  broadcast  message.  These  phase-switching 
schedules  ensure  the  match  operations  among  nodes. 

But,  as  we  mentioned  earlier,  mismatch  operations  among  nodes 
are  unavoidable  because  there  are  always  some  clock  drifts  caused 
by  unstable  and  inaccurate  frequency  standards. 
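To give a feel for the scale of the problem, the Python sketch below assumes a simple linear drift model in which each free-running clock deviates from true time at a constant rate expressed in parts per million (ppm). The model, the function names and the numbers are illustrative assumptions and are not part of ASCEMAC.

```python
def drift_offset(ppm_a, ppm_b, elapsed_s):
    """Time offset (seconds) between two clocks after elapsed_s seconds."""
    return abs(ppm_a - ppm_b) * 1e-6 * elapsed_s


def time_until_offset(ppm_a, ppm_b, max_offset_s):
    """Seconds until the offset between the two clocks reaches max_offset_s."""
    rate = abs(ppm_a - ppm_b) * 1e-6
    return float("inf") if rate == 0.0 else max_offset_s / rate


# Two clocks at +40 ppm and -10 ppm drift apart by about 50 ms in 1000 s.
offset = drift_offset(40.0, -10.0, 1000.0)
```

Under this model, the second function gives a rough upper bound on how long a schedule can be left in place before the mismatch between a pair of nodes reaches a chosen tolerance.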

The following example illustrates how clock drift results in a mismatch operation, and how ASCEMAC removes this mismatch to ensure successful communication. Consider a source-destination pair, nodes A and B. If the frequency standard of A is faster than that of B and super-time-slot-1 is the time slot of A and B, then after a period of time A will enter super-time-slot-1 ahead of B by a non-negligible time ($\Delta t_1$). During $\Delta t_1$, data transmissions between them cannot succeed, because B's radio is still off. See Fig. 4.

Fig. 4. Mismatch Operation Due to Clock Drift

Schedule broadcast is responsible for removing mismatch, in addition to informing nodes about phase-switching schedules. From Fig. 5, we see that $\Delta t_1$ between nodes A and B is successfully removed after receiving a new schedule broadcast message.

Fig. 5. Mismatch Operation Removed by Re-schedule

From the above discussion, we can see that protecting nodes against mismatch operations avoids unsuccessful transmissions caused by clock drifts.

However, it is unnecessary to offer match operations at all times and for all nodes. For instance, two nodes that have little information to exchange do not need to switch phases coincidently, since their mismatch operations have little effect on information transmission. In that case, some nodes could be allowed to go out of coincidence, and be rescheduled only if necessary.

Schedule broadcast has another function besides removing mismatch and announcing phase-switching schedules: the cluster head can choose more suitable durations for the power on/off phases according to current traffic conditions. For WSNs, the traffic is heterogeneous. As the traffic arrival rate fluctuates, the previously chosen $T_f$ and $T_n$ may no longer be optimal. For example, when the traffic arrival rate increases, more data packets arrive during the Off-Phase and On-Phase, so the probability of buffer overflow increases. Conversely, when the traffic arrival rate decreases, fewer data packets arrive during the Off-Phase and On-Phase, so some energy is wasted idling in the On-Phase.

We adopt an adaptive adjustment method to determine the interval of schedule broadcast. This method saves energy by avoiding unnecessary schedule broadcasts and idling, while ensuring an acceptable successful data transmission rate.

We use

$$T_i = \xi_i \times T_{i-1} \qquad (8)$$

as the interval adjusting function, where $T_i$ is the ith interval of schedule broadcast and $\xi_i$ is the ith adjustment factor, a positive number.

We design a rescheduling-FLS to determine the value of $\xi_i$, which reflects the degree of influence of clock drifts and traffic intensity changes on communications.

In our rescheduling-FLS, there are three antecedents:

• the ratio of nodes with overflowed buffer ($R_{of}$);

• the ratio of nodes with high failing transmission rate ($R_{hf}$);

• the ratio of nodes experiencing unsuccessful transmission ($R_{sr}$).

The consequent is the adjustment factor for the interval of schedule broadcast ($\xi$). The linguistic variables used to represent $R_{of}$, $R_{hf}$ and $R_{sr}$ are divided into three levels: Low, Moderate and High. $\xi$ is divided into five levels: Highly Decrease, Decrease, Unchange, Increase and Highly Increase. We show these MFs in Fig. 6 and Fig. 7.


We design our rescheduling-FLS using rules, with one example shown below:

$R^l$: IF the ratio of nodes ($x_1$) with overflowed buffer is High, the ratio of nodes ($x_2$) with high failure rate is High and the ratio of nodes ($x_3$) experiencing unsuccessful transmission is High, THEN the adjustment factor for the interval of schedule broadcast ($\xi$) should be Highly Decrease.


Fig.  7.  Consequent  Membership  Function 


TABLE I

The rules for adjusting the interval of schedule broadcast. Ante1 is the ratio of nodes having overflowed buffer. Ante2 is the ratio of nodes with high failure transmission rate. Ante3 is the ratio of nodes experiencing unsuccessful transmission. Consequent is the adjustment factor for the interval of schedule broadcast.

Rule | Ante1    | Ante2    | Ante3    | Consequent
1    | Low      | Low      | Low      | Highly Increase
2    | Low      | Low      | Moderate | Increase
3    | Low      | Moderate | Moderate | Decrease
4    | Low      | Moderate | High     | Decrease
5    | Moderate | Low      | Moderate | Increase
6    | Moderate | Low      | High     | Unchange
7    | Moderate | Moderate | Moderate | Decrease
8    | Moderate | Moderate | High     | Highly Decrease
9    | Low      | High     | High     | Decrease
10   | Moderate | High     | High     | Highly Decrease
11   | High     | Low      | Moderate | Increase
12   | High     | Low      | High     | Unchange
13   | High     | Moderate | Moderate | Decrease
14   | High     | Moderate | High     | Decrease
15   | High     | High     | High     | Highly Decrease

We  summarize  all  meaningful  rules  in  Table  I. 

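For every input $(x_1, x_2, x_3)$, the output is defuzzified [8] using

$$\xi(x_1, x_2, x_3) = \frac{\sum_{l=1}^{15} \bar{\xi}^l\, \mu_{F_1^l}(x_1)\, \mu_{F_2^l}(x_2)\, \mu_{F_3^l}(x_3)}{\sum_{l=1}^{15} \mu_{F_1^l}(x_1)\, \mu_{F_2^l}(x_2)\, \mu_{F_3^l}(x_3)} \qquad (9)$$

where $\bar{\xi}^l$ is the height of the consequent fuzzy set of rule $l$. The heights of the five fuzzy sets depicted in Fig. 7 are 0.2, 0.5, 1.0, 3.0 and 4.0.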

The inputs of the rescheduling-FLS are acquired from the TRFR messages sent by all normal nodes. Before broadcasting schedules, the cluster head uses the rescheduling-FLS to estimate how strongly clock drifts and traffic intensity changes influence communications. After obtaining $\delta$, the cluster head uses (8) to determine the value of the next interval of schedule broadcast.
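The rule base of Table I and the height defuzzifier can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the membership functions of Fig. 6 are not reproduced in the text, so the triangular shapes on [0, 1] used here are hypothetical, and pairing the five heights with the consequent levels in ascending order (Highly Decrease = 0.2 up to Highly Increase = 4.0) is likewise an assumption. The function names are illustrative.

```python
# Rule base copied from Table I: (Ante1, Ante2, Ante3, Consequent).
RULES = [
    ("Low", "Low", "Low", "Highly Increase"),
    ("Low", "Low", "Moderate", "Increase"),
    ("Low", "Moderate", "Moderate", "Decrease"),
    ("Low", "Moderate", "High", "Decrease"),
    ("Moderate", "Low", "Moderate", "Increase"),
    ("Moderate", "Low", "High", "Unchange"),
    ("Moderate", "Moderate", "Moderate", "Decrease"),
    ("Moderate", "Moderate", "High", "Highly Decrease"),
    ("Low", "High", "High", "Decrease"),
    ("Moderate", "High", "High", "Highly Decrease"),
    ("High", "Low", "Moderate", "Increase"),
    ("High", "Low", "High", "Unchange"),
    ("High", "Moderate", "Moderate", "Decrease"),
    ("High", "Moderate", "High", "Decrease"),
    ("High", "High", "High", "Highly Decrease"),
]

# Assumed pairing of the five heights with the five consequent levels.
HEIGHTS = {
    "Highly Decrease": 0.2,
    "Decrease": 0.5,
    "Unchange": 1.0,
    "Increase": 3.0,
    "Highly Increase": 4.0,
}


def membership(level, x):
    """Hypothetical triangular MFs on [0, 1]; the real shapes are in Fig. 6."""
    if level == "Low":
        return max(0.0, 1.0 - 2.0 * x)
    if level == "Moderate":
        return max(0.0, 1.0 - 2.0 * abs(x - 0.5))
    if level == "High":
        return max(0.0, 2.0 * x - 1.0)
    raise ValueError(f"unknown level: {level}")


def adjustment_factor(x1, x2, x3):
    """Height defuzzifier: firing-strength-weighted average of rule heights."""
    num = 0.0
    den = 0.0
    for l1, l2, l3, consequent in RULES:
        # Product of the three antecedent membership grades.
        w = membership(l1, x1) * membership(l2, x2) * membership(l3, x3)
        num += HEIGHTS[consequent] * w
        den += w
    # Assumption: if no rule fires, leave the interval unchanged.
    return num / den if den > 0.0 else 1.0
```

Under these assumptions, three fully High inputs fire only rule 15, so the broadcast interval would be scaled by the Highly Decrease height.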

D. Time-Slot Assignment

For classic TDMA systems, such as GSM, the system time is divided into slots, and each user occupies cyclically repeating time slots. A typical TDMA system transmits data in a buffer-and-burst manner; thus the transmission for any user is non-continuous and high-quality time synchronization is needed.

In ASCEMAC, however, there is no time synchronization and no global clock in the system. In this case, the probability of successful transmission would degrade if we still used the buffer-and-burst method to schedule communications. The following example illustrates the relationship between the length of the super-time-slot and the mismatch operation. As shown in Fig. 4, there is a time difference $\Delta t_1$ between node A, the source, and node B, the destination. A starts sending at the beginning of its super-time-slot, and $k$ is the number of data packets sent during this transmission period. $T_{s,min}$ is the least time needed to detect the synchronization information of a data packet. We consider two cases:

1) If $T_{s,min} < \Delta t_1 < T_d$:

a) When $k=1$, no packet sent during this slot can be received by node B, i.e., a 0% successful transmission rate;

b) When $k=3$, two packets sent during this slot can be received by node B, i.e., a 67% successful transmission rate;

c) When $k=n$, $n-1$ packets sent during this slot can be received by node B, i.e., a $\frac{n-1}{n} \times 100\%$ successful transmission rate.

2) If $T_d < \Delta t_1 < 2T_d$:

a) When $k=1$, no packet sent during this slot can be received by node B, i.e., a 0% successful transmission rate;

b) When $k=3$, one packet sent during this slot can be received by node B, i.e., a 33% successful transmission rate;

c) When $k=n$, $n-2$ packets sent during this slot can be received by node B, i.e., a $\frac{n-2}{n} \times 100\%$ successful transmission rate.

Notice that, as $k$ increases, a larger share of the transmissions succeeds under the same mismatch condition. Therefore, letting one source-destination pair occupy the common channel continuously for several time-slots is an effective way to tolerate mismatch between source and destination.
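The two cases above follow one pattern: a mismatch that falls between $(m-1)T_d$ and $mT_d$ costs the first $m$ packets of the burst. A minimal Python sketch of that pattern is given below. The function name and the handling of a mismatch no larger than $T_{s,min}$ (treated as losing nothing) are assumptions made for illustration, not taken from the paper.

```python
import math


def success_rate(k, delta_t, t_d, t_s_min):
    """Fraction of k back-to-back packets received under clock mismatch delta_t.

    Assumption: a mismatch of at most t_s_min loses no packet; otherwise
    ceil(delta_t / t_d) packets at the start of the burst are lost.
    """
    if delta_t <= t_s_min:
        lost = 0
    else:
        lost = math.ceil(delta_t / t_d)
    return max(k - lost, 0) / k


if __name__ == "__main__":
    # Case 1 (T_s,min < dt < T_d) and case 2 (T_d < dt < 2 T_d) for growing k.
    for k in (1, 3, 10, 50):
        print(k, success_rate(k, 0.5, 1.0, 0.1), success_rate(k, 1.5, 1.0, 0.1))
```

The printed rates approach 1 as $k$ grows, which is the argument for giving one source-destination pair several consecutive time-slots.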

In our algorithm, we adopt a non-buffer-and-burst method to transmit data. That is, based on the number of data packets waiting for transmission and the unsuccessful transmission rate, we design an allocation-FLS that allocates a super-time-slot of a suitable size to each node.

There are two antecedents for our allocation-FLS:

• the traffic arrival rate ($R_a$);

• the transmission failure rate ($R_{us}$).

The consequent is the priority of this node performing transmission ($P_t$).

We also use the antecedent MFs in Fig. 6 and the consequent MFs in Fig. 7.

We  design  our  allocation-FLS  using  rules  with  one  example  shown 
below: 

$R^l$: IF the traffic arrival rate ($x_1$) is High and the unsuccessful transmission rate ($x_2$) is Low, THEN the priority of this node performing transmission ($y$) should be Very Low.

We  summarize  all  rules  in  Table  II. 

With the allocation-FLS, the cluster head utilizes the information acquired from TRFR messages to calculate a priority for each node. The node with the highest priority is the first one to communicate during an On-Phase.

In summary, we have described how to determine, establish and maintain phase-switching schedules so that nodes save energy and communicate successfully.

III.  Simulations  and  Performance  Evaluation 

We run simulations using OPNET. Nodes are deployed randomly in an area of 1000 m × 1000 m. The radio range is 30 meters, the symbol rate is 40 ksps and the data frame length is 1024 bits. For each node, the clock drift rate ranges from 1 to 100 μs.

We use the same energy consumption model as in [11] for the radio hardware. To transmit an $l$-symbol message over a distance $d$, the radio expends:

$$E_{Tx}(l, d) = E_{Tx\text{-}elec}(l) + E_{Tx\text{-}amp}(l, d) = l \times E_{elec} + l \times \epsilon_{fs} \times d^2 \tag{10}$$


TABLE II

The rules for super-time-slot allocation. Antecedent 1 is the traffic arrival rate. Antecedent 2 is the unsuccessful transmission rate. Consequent is the priority of this node performing transmission.

Rule   Antecedent 1   Antecedent 2   Consequent
1      Low            Low            Moderate
2      Low            Moderate       High
3      Low            High           Very High
4      Moderate       Low            Low
5      Moderate       Moderate       Moderate
6      Moderate       High           High
7      High           Low            Very Low
8      High           Moderate       Low
9      High           High           Moderate


the traffic arrival rate of 0.1, 0.2 and 0.5 packets/s. It shows that the variation of the successful transmission rate with the number of nodes is less than 97.099% - 96.087% = 1.012%. These two experiments show that ASCEMAC adapts to both network density and traffic intensity.

and to receive this message, the radio expends:

$$E_{Rx}(l) = l \times E_{elec} \tag{11}$$

The electronics energy, $E_{elec}$, as described in [11], depends on factors such as coding, modulation, pulse-shaping and matched filtering, and the amplifier energy, $\epsilon_{fs} \times d^2$, depends on the distance to the receiver and the acceptable bit error rate. In this paper, we choose $E_{elec} = 50$ nJ/sym and $\epsilon_{fs} = 10$ pJ/sym/m$^2$.
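As a quick numerical check of this radio model, the sketch below evaluates the transmit and receive costs with the constants chosen above. The function names are illustrative, and treating the 1024-bit data frame as 1024 symbols is an assumption made only for the example.

```python
E_ELEC = 50e-9    # electronics energy, J per symbol (50 nJ/sym)
EPS_FS = 10e-12   # amplifier energy, J per symbol per m^2 (10 pJ/sym/m^2)


def tx_energy(l, d):
    """Energy in joules to transmit an l-symbol message over d meters."""
    return l * E_ELEC + l * EPS_FS * d ** 2


def rx_energy(l):
    """Energy in joules to receive an l-symbol message."""
    return l * E_ELEC


if __name__ == "__main__":
    # Assumed example: one 1024-symbol frame sent over the 30 m radio range.
    print(f"transmit: {tx_energy(1024, 30.0):.4e} J")
    print(f"receive:  {rx_energy(1024):.4e} J")
```

At the 30 m radio range the amplifier term is about 18% of the electronics term, so under this model most of the per-frame cost comes from the electronics on both ends.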

A.  TRFR  Message  Successful  Transmission  Probability 

Fixing the duration of the TRFR-Phase at 5, 10, 15, 20, 25 and 30 seconds separately and increasing the number of nodes in a cluster from 5 to 30, we obtain a series of curves of the successful transmission rate of data packets (see Fig. 8). Notice that, if the TRFR-Phase duration is longer than 10 s, the TRFR message of each node has an almost 99% probability of being sent successfully to the cluster head. This result shows that, with our algorithm, the cluster head can acquire the necessary information from normal nodes to determine system schedules successfully.


Fig.  8.  Successful  Transmission  Rate  for  TRFR  Message 


B.  ASCEMAC  Adaptation 

We investigate the influence of network density and traffic intensity on the system performance of our algorithm. In Fig. 9, we plot the number of nodes in a cluster versus the successful transmission rate of data packets. We run the simulations under four different average clock drift rates, i.e., 0.0, 0.001, 0.01 and 0.1 ms/s. Observe that, for each clock drift rate, the variation of the successful transmission rate with the number of nodes is less than 85.714% - 83.606% = 2.108%. In Fig. 10, we compare the successful transmission rate at


Fig.  9.  Successful  Transmission  Rate 


Fig.  10.  Successful  Transmission  Rate 


C.  ASCEMAC  vs.  S-MAC  and  TRAMA 

We compare ASCEMAC against S-MAC and TRAMA. In Fig. 11, we plot the average clock drift rate versus the average energy consumption. Notice that ASCEMAC saves about 68.263% to 189.232% energy per packet compared to TRAMA and S-MAC. That means that, when ASCEMAC is used instead of TRAMA, the lifetime of the same WSN can be extended by at least one time, and when it is used instead of S-MAC the lifetime can be extended by at least three times. This experiment also shows that schedule-based MAC protocols save more energy than contention-based MAC protocols. The reason is that contention-based MAC protocols consume some energy in competing for access to the common channel.

In Fig. 12, we compare the average waiting time of data packets. Observe that ASCEMAC has an about 56.178% shorter waiting time than TRAMA, and an about 8.648% shorter waiting time than S-MAC. Moreover, in this experiment, we set $W_{max}$ to 12 seconds. We found that the average waiting times for ASCEMAC are smaller than $W_{max}$ even at different clock drift rates. For S-MAC and TRAMA, however, the average waiting times are longer than $W_{max}$ when the clock drift rate is larger than 0.01 ms/s. Hence, ASCEMAC is more sensitive to the traffic lifetime requirement than S-MAC and TRAMA.

In Fig. 13, we plot the average clock drift rate versus the successful transmission rate. It can be seen that ASCEMAC outperforms S-MAC, with an about 12.5% higher successful data transmission rate. ASCEMAC and TRAMA have almost the same successful transmission rate, but the energy saving and waiting time performance of ASCEMAC are much better than those of TRAMA.


Fig. 11. Average Energy Consumption


Fig. 12. Average Waiting Time


Fig.  13.  Successful  Transmission  Rate 


IV. Conclusion

In this paper, we propose a novel energy-efficient MAC protocol for wireless sensor networks: ASCEMAC. Compared to existing energy-efficient MAC protocols, ASCEMAC makes the following contributions:

• Saving energy at the MAC layer by reducing the energy consumed on collisions and idling, and trading off data waiting time;

• Utilizing a free-running scheme and schedule broadcast to set up phase-switching schedules without establishing a global clock within a cluster;

• Exploiting a rescheduling method, instead of time synchronization, to handle the mismatch caused by clock drifts, and taking advantage of fuzzy logic theory, which has distinctive capabilities for coping with uncertainty, to act as our rescheduling-FLS;

• Designing a time-slot allocation system, the allocation-FLS, based on traffic intensity and unsuccessful transmission rate;

• Proposing a traffic intensity and network density-based model to acquire the optimal power on/off duration, interval of schedule broadcast, and super-time-slot length and order.

Simulation results show that our algorithm successfully acquires the optimum values of the essential algorithm parameters to ensure the average successful transmission rate, decrease the average waiting time of data packets, and reduce the average energy consumption. Therefore, network performance is improved and network lifetime is extended by using our algorithm.
Abstract: Query processing methods have been studied extensively in the context of database systems. But they are not directly applicable in sensor database systems due to the characteristics of sensor networks: the decentralized nature of sensor networks, the limited computational power and energy scarcity of individual sensor nodes, and the imperfect information recorded. In this paper, we propose an energy-efficient query optimization algorithm (QOA) for imperfect information in sensor database systems. We employ an in-network query processing method, which tasks sensor networks through declarative queries. Given a query, our QOA generates an energy-efficient query plan for in-network query processing. Moreover, our algorithm can explicitly expose the uncertainty and ambiguity of query results to database users. It is troublesome or even impossible to keep a large amount of data in sensor database systems because of network resource constraints. In our algorithm, we formulate the probability distribution functions (PDFs) of measurement uncertainties according to the knowledge of the observation coverage and the devices utilized, instead of estimating them from prior data. The simulation results demonstrate that our algorithm can vastly reduce resource usage and thus extend the lifetime of a sensor database system.

I.  Introduction 

Recent developments in integrated circuit technology have allowed the construction of low-cost small sensor nodes with signal processing and wireless communication capabilities. Distributed wireless sensor networks (WSNs), which are basically composed of sensor nodes connected through ad hoc networking, have an increasing number of potential applications. WSNs have been applied in military sensing, physical security, air traffic control, environment monitoring and structure monitoring [1], etc.

From a data storage point of view, a WSN can be regarded as a distributed database, i.e., a sensor database system (SDS). Each node in a WSN takes time-stamped measurements of physical phenomena such as heat, sound, light, pressure, or motion. SDSs, compared to traditional database systems, store the data within the network and allow queries to be injected anywhere in the network. Data distribution along with data replication makes the entire system more robust to failures and can provide increased bandwidth and throughput, as well as greater data availability. But this distributed nature makes query optimization significantly harder.

Moreover, WSNs are often developed to run unattended for years. This calls for not only robust hardware and software, but also lasting energy resources. However, current sensor nodes are usually powered by limited batteries [2], and replacing or recharging batteries may, in many cases, be impractical or uneconomical. A recent study [11] shows that, in data collection applications, about 40% of the energy consumption is due to communication and 58% is due to sensing. Therefore, in devising the best overall execution plan, data queries designed for SDSs should be highly efficient and optimized in terms of the energy spent on communication and data sensing.

Data aggregation [2] techniques have been investigated recently as efficient approaches to achieve significant energy conservation in SDSs. The main idea of data aggregation is that aggregation points combine data arriving from different sensor nodes, eliminate redundancy, and minimize the number of transmissions before forwarding data to the base station. Some important existing works are presented in [7], [8], [9], etc. In this paper, however, we explore a new approach, the disk covering method, to reduce the information redundancy in communication and sensing for energy conservation.

Another target of this paper is how to reason about query uncertainty from imperfect information in SDSs. “Imperfect information is ubiquitous - almost all the information that we have about the real world is not certain, complete and precise” [10]. Examples include measurement and recording errors, missing data, incompatible scaling, obsolescence, and data aggregation. Therefore, such imperfection is a fact in database systems. Nowadays, more and more database designers study the whole problem, including both certain and uncertain information, to describe the real world more accurately through database systems.

An extensive survey of the work done in the database and artificial intelligence communities on imperfect information is given in [12]. It points out that in order to build useful information systems, it is necessary to learn how to represent and reason with imperfect information. Notice that uncertain information is typically handled by attaching a number, which represents a subjective measure of the certainty of the uncertain element according to some observer. The way in which the number is manipulated depends on the theory that underlies the number. There are possibilistic, probabilistic and fuzzy approaches [13] [14] [15] [16] [17]. Most of them consider only how to represent uncertain information in the database system and how to perform relational calculus among relations. How to reason about uncertainty with imperfect information is seldom studied.

In this paper, we propose an energy-efficient query optimization algorithm (QOA) for imperfect information in SDSs. We employ an in-network query processing method, which tasks sensor networks through declarative queries. Given a query, our algorithm generates an energy-efficient query plan for in-network query processing. The optimized query plan can vastly reduce resource usage and thus extend the lifetime of SDSs. Moreover, our algorithm can explicitly expose the uncertainty and ambiguity of query results to database users. It is troublesome or even impossible to keep a large amount of data in sensor database systems because of network resource constraints and environment uncertainties. In our algorithm, we manage uncertainties using probability theory as in [6] and [5], but the probability distribution functions (PDFs) of measurement uncertainty are formulated according to the knowledge of the observation coverage and the devices.

The  remainder  of  this  paper  is  organized  as  follows:  in 
Section  II,  we  provide  some  preliminaries  on  vector  space 
model  and  k-partial  set  cover  problem;  Section  III  presents 
our  algorithm;  simulation  results  are  given  in  Section  IV; 
Section  V  concludes  this  paper. 

II.  Preliminaries 
A.  Vector  Space  Model 

The Vector Space Model (VSM) [18] [19] is a way to represent documents through the words that they contain. It has been widely used in the traditional information retrieval (IR) field [20] [21]. Most search engines also use similarity measures based on this model to rank Web documents. VSM creates a space in which both documents and queries are represented by vectors. For a fixed collection of documents, an $m$-dimensional vector is generated for each document and each query from sets of terms with associated weights. Then, a vector similarity function, such as the inner product, can be used to compute the similarity between a document and a query.

In VSM, the weights associated with the terms are calculated based on the following two numbers:

• term frequency, $f_{ij}$, the number of occurrences of term $y_j$ in document $x_i$, and

• inverse document frequency, $g_j = \log(N/d_j)$, where $N$ is the total number of documents in the collection and $d_j$ is the number of documents containing term $y_j$.

The similarity $sim_{vs}(q, x_i)$ between a query $q$ and a document $x_i$ can be defined as the inner product of the query vector $Q$ and the document vector $X_i$:

$$sim_{vs}(q, x_i) = Q \cdot X_i = \sum_{j=1}^{m} v_j\, w_{ij}$$

where $m$ is the number of unique terms in the document collection. The document weight $w_{ij}$ and the query weight $v_j$ are

$$w_{ij} = f_{ij} \times g_j = f_{ij} \log(N/d_j) \quad \text{and} \quad v_j = \begin{cases} \log(N/d_j) & \text{if } y_j \text{ is a term in } q \\ 0 & \text{otherwise.} \end{cases}$$
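To make the weighting concrete, the following Python sketch computes the inner-product similarity for a toy collection. Documents are represented as token lists, the natural logarithm is used, and the function names and sample documents are hypothetical choices for illustration.

```python
import math
from collections import Counter


def idf(term, docs):
    """Inverse document frequency g_j = log(N / d_j); 0 if the term is unseen."""
    d_j = sum(1 for doc in docs if term in doc)
    return math.log(len(docs) / d_j) if d_j else 0.0


def similarity(query_terms, doc, docs):
    """Inner product of query weights v_j and document weights w_ij."""
    tf = Counter(doc)
    total = 0.0
    for term in set(query_terms):
        g = idf(term, docs)
        v_j = g                 # query weight for a term present in q
        w_ij = tf[term] * g     # document weight: term frequency times idf
        total += v_j * w_ij
    return total


if __name__ == "__main__":
    docs = [["heat", "sound"], ["heat", "light", "light"], ["motion"]]
    for doc in docs:
        print(doc, round(similarity(["light"], doc, docs), 4))
```

A term that appears in every document gets $g_j = \log 1 = 0$, so it contributes nothing to the ranking, which is the intended effect of the inverse document frequency.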


B.  k-Partial  Set  Cover  Problem 

Covering problems are widely studied in discrete optimization. Basically, these problems involve picking a least-cost collection of sets to cover elements. Classical problems in this framework include general set cover problems and partial covering problems. The k-partial set cover problem [23], a partial covering problem, asks how to choose a minimum number of sets to cover at least $k$ elements, and which $k$ elements should be chosen.

The k-partial set cover problem can be formulated as an integer program as follows.

MINIMIZE:

$$\sum_{j=1}^{m} c(S_j)\, x_j \tag{3}$$

SUBJECT TO:

$$y_i + \sum_{j:\, t_i \in S_j} x_j \ge 1, \quad i = 1, 2, \ldots, n, \tag{4}$$

$$\sum_{i=1}^{n} y_i \le n - k, \tag{5}$$

$$x_j \ge 0, \quad j = 1, 2, \ldots, m, \tag{6}$$

$$y_i \ge 0, \quad i = 1, 2, \ldots, n, \tag{7}$$

where $x_j \in \{0, 1\}$ corresponds to each set $S_j \in S$, and $x_j = 1$ iff set $S_j$ belongs to the cover. Similarly, $y_i = 1$ iff element $t_i \in T$ is not covered. Clearly, there can be at most $n - k$ such uncovered elements.
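For very small instances, the k-partial set cover problem can be solved by exhaustive search, which is a convenient way to see what the formulation asks for. The sketch below is only an illustrative brute-force solver with hypothetical names; its running time is exponential in the number of sets, and it is not the approximation algorithm of [23].

```python
from itertools import combinations


def k_partial_set_cover(sets, costs, k):
    """Return (cost, chosen_indices) of a cheapest sub-collection that covers
    at least k elements, or None if no sub-collection does."""
    best = None
    indices = range(len(sets))
    for r in range(len(sets) + 1):
        for chosen in combinations(indices, r):
            # Union of the chosen sets (empty set when nothing is chosen).
            covered = set().union(*(sets[j] for j in chosen))
            if len(covered) >= k:
                cost = sum(costs[j] for j in chosen)
                if best is None or cost < best[0]:
                    best = (cost, chosen)
    return best


if __name__ == "__main__":
    sets = [{1, 2, 3}, {3, 4}, {5}, {1, 2, 3, 4, 5}]
    costs = [1, 1, 1, 3]
    print(k_partial_set_cover(sets, costs, 4))
    print(k_partial_set_cover(sets, costs, 5))
```

In this toy instance, covering four of the five elements is cheaper than covering all of them, which illustrates why leaving up to $n - k$ elements uncovered can reduce the cost of the cover.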

III.  Our  Algorithm  Description 

In an SDS, when a query is submitted, common rules for selecting active sensor nodes are generated based on the query. Then, through the query optimization algorithm, each sensor node determines whether it will participate in the query processing according to its location, remaining energy and measurement accuracy. Finally, the chosen sensor nodes collect data and send them back to the sink together with their uncertainty. In our algorithm, an active sensor node is defined as a sensor node that collects data and responds during a query processing.

A.  Network  Model 

In  SDS,  a  large  number  of  sensor  nodes  are  deployed  over 
an  area.  All  nodes  are  interconnected  to  one  or  more  gateways 
by  means  of  wireless  links.  Gateways  are  in  charge  of  relaying 
data  to  a  base  station. 

Because of the way SDSs are deployed, it is difficult or even impossible to pre-determine the exact locations of sensor nodes. After all sensor nodes have been deployed, each node sends its location information to the sinks through certain messages, such as beacons. The topology of the area controlled by each gateway is formed according to this information. Then, all topology information is gathered at the base station. We assume that each sensor node in our algorithm is capable of acquiring its own location through certain methods.

B.  Query  VSM  Design 

Because the network density is high and the topology is not predetermined, overlapping of sensing ranges among nodes in SDSs is unavoidable and space variant (the node density is not uniform over the network). This is the main cause of redundant data. Communicating and storing these redundant data is one of the biggest sources of wasted energy during query processing. On the other hand, redundancy increases the confidence of query results. Therefore, there is a trade-off between increasing the confidence of the query answer and saving energy.

We address the high information redundancy by controlling the number of nodes that respond to queries. Obviously, the fewer nodes communicate and sense during a query, the less energy is consumed. The problem is that the query results supplied by a subset of the nodes should reflect the condition of the whole area to an acceptable degree; otherwise they are useless to database users. Therefore, the key issue is to determine how many nodes and which nodes should be selected for a query.

We consider the following factors for this issue:

• Sensor Location

The location of a sensor directly determines which area can be observed. Given an area and some nodes deployed over it, in order to employ as few nodes as possible to cover as large an area as possible, we should select the nodes located at optimum locations. We discuss how to determine the optimum locations for a query in the Optimum Location Determination section.

• Measurement Accuracy

Since the cost and the measurement accuracy of sensor nodes are related to each other, sensor nodes with different accuracy levels are deployed simultaneously in an SDS for economic reasons. Furthermore, through a query, database users specify not only what information they want to retrieve, but also their requirement on the uncertainties of the query results. In this case, we should select the nodes whose measurement accuracy is close to the database users' requirement.

• Battery Level

The battery level of sensor nodes is our third factor in node selection. When the power of a node is used up, the data observed by this node will be missed, which reduces the confidence to some degree. This inspires us to select the nodes with more remaining battery so that the query processing can be completed by all chosen nodes.

In our algorithm, we employ VSM to combine all the factors considered and to select the most relevant nodes to participate in a query processing. Our goal is to supply satisfactory query results using as little energy as possible and the most suitable sensor nodes.


In our query VSM, the query vector is designed as $(R_l, A_d, B_m)$, where

• $R_l$ stands for location relativity. It indicates the distance between the location of a sensor node and the optimum location. If the two positions match exactly, $R_l = 1$.

• $A_d$ stands for measurement accuracy. It indicates the measurement accuracy of a sensor node. $A_d$ equals the probability distribution function (PDF) of each node's measurement error. For example, the measurement accuracy of the CXM539 is 100 nT (1 mGauss); in this case, $A_d = 0.002$.

• $B_m$ stands for remaining battery. It indicates how much power remains in a sensor node. The unit of $B_m$ is J.

After a database user submits a query (shown in Fig. 1), the base station selects the optimum locations for this query, and then translates the query from SQL [4] form into a query VSM vector $Q$. For the query given in Fig. 1, $Q = (1, 0.2, 5)$. We assume that the maximum energy of a node is 5 J.

SELECT MIN(TEMP), MAX(TEMP), AVG(TEMP)
FROM nodes
WHERE LOCATION=location1 AND PROB1<0.2 AND PROB2<0.2
SAMPLE PERIOD 100s;

Fig.  1.  SQL  query 

After receiving $Q$, each node starts to update its own query VSM vector $h_i = (R_{l,i}, A_{d,i}, B_{m,i})$, $i = 1, 2, \ldots, n$, where we assume that there are $n$ nodes in the network in total. $R_{l,i}$ is defined as:

=  dEESHSEjEE.  and 

K 

$$R_{l,i} = \{r_{l,i,1}, r_{l,i,2}, \ldots, r_{l,i,v}\} \tag{8}$$

where $(x_i, y_i)$ is the location of node $i$, $(x_0, y_0)$ is the position of an optimum location, and $K$ is the normalization factor, which ensures that the value of $r_{l,i,j}$ is less than one. We assume that there are $v$ optimum locations for a query.

We design a query correlation indicator $\gamma$ to express the degree of correlation between each node and a query. We formulate $\gamma$ in (9).

$$\gamma_{Q,h_i} = \max_j \{ Q \cdot h_{i,j} \} = \max_j \frac{1 \times R_{l,i,j} + (1 - A_d)(1 - A_{d,i}) + B_m \times B_{m,i}}{\sqrt{1^2 + (1 - A_d)^2 + B_m^2}\,\sqrt{R_{l,i,j}^2 + (1 - A_{d,i})^2 + B_{m,i}^2}} \tag{9}$$

The higher $\gamma$ is, the more likely this sensor node is to take part in the query as an active sensor node.
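One way to read the query correlation indicator is as a cosine similarity between the query vector and the node vector, maximized over the optimum locations. The Python sketch below follows that reading. It is an interpretation rather than a confirmed transcription of (9): the use of $1 - A_d$ as the accuracy component, the function name and the default query values $(1, 0.2, 5)$ are assumptions taken from the surrounding text.

```python
import math


def query_correlation(r_locs, a_d_node, b_node, a_d_query=0.2, b_query=5.0):
    """Largest cosine similarity between the query vector and the node vector,
    taken over the node's location relativities r_locs (one per optimum
    location). Returns 0.0 when r_locs is empty."""
    q = (1.0, 1.0 - a_d_query, b_query)
    best = 0.0
    for r in r_locs:
        h = (r, 1.0 - a_d_node, b_node)
        dot = sum(a * b for a, b in zip(q, h))
        norm = math.sqrt(sum(a * a for a in q)) * math.sqrt(sum(b * b for b in h))
        if norm > 0.0:
            best = max(best, dot / norm)
    return best


if __name__ == "__main__":
    # A node close to one optimum location versus a node far from all of them.
    print(query_correlation([0.2, 0.9], a_d_node=0.2, b_node=5.0))
    print(query_correlation([0.2], a_d_node=0.2, b_node=5.0))
```

Because the battery component is an order of magnitude larger than the other two in these units, it dominates the cosine; a practical implementation would probably normalize $B_m$ to [0, 1] first, which is a design choice not specified in the text.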

C.  Active  Sensor  Nodes  Choosing 

The query correlation indicator $\gamma$ is exchanged between neighbors (nodes that are only one hop apart are neighbors and can communicate with each other directly). Through cooperation among nodes, the nodes that have the highest query correlation degree among their neighbors are picked to participate in the query. The pseudo-code of the active sensor node choosing algorithm is given in Fig. 2.

// initialize the covering set
C ← ∅;

// initialize the uncovered set; N is the closure of all neighbors of a node
UC ← N;

// select the active nodes
while UC is not empty
do
    select node i with the highest query correlation;
    if node i is not covered yet
        C ← C ∪ {i};
        UC ← UC \ {i};
    else
        go to node j with the next highest query correlation;
end

Fig.  2.  Algorithm  for  active  nodes  choosing 
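The selection in Fig. 2 can be sketched as a greedy pass over the nodes in decreasing order of query correlation. The version below is a centralized illustration with hypothetical names. It assumes that choosing a node also covers its one-hop neighbors, which is one plausible reading of the figure; the paper's version runs cooperatively on the nodes themselves.

```python
def choose_active_nodes(gamma, neighbors):
    """Greedy selection of active nodes.

    gamma:     dict mapping node id -> query correlation indicator
    neighbors: dict mapping node id -> iterable of one-hop neighbor ids
    Assumption: an active node covers itself and its one-hop neighbors.
    """
    covered = set()
    active = []
    # Visit nodes from the highest to the lowest query correlation.
    for node in sorted(gamma, key=gamma.get, reverse=True):
        if node not in covered:
            active.append(node)
            covered.add(node)
            covered.update(neighbors.get(node, ()))
    return active


if __name__ == "__main__":
    gamma = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6}
    neighbors = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
    print(choose_active_nodes(gamma, neighbors))
```

Nodes that are skipped because a better-correlated neighbor already covers them are the ones that can switch to sleep mode.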

Notice that, by running our QOA in each sensor node, the most relevant sensor nodes are chosen to answer the query: those that are closest to the optimum locations, satisfy the uncertainty requirement and have a high battery level. The other nodes switch into an energy saving mode, i.e., sleep mode. Therefore, an SDS composed of these optimized sensor nodes can greatly improve energy efficiency.

D.  Optimum  Location  Determination 

We model the problem of determining optimum locations for a query as a k-partial set cover problem, defined as follows. Let $n$ be the number of all sensor nodes and $p$ be a given positive integer such that $p < n$. Given $k$ identical disks with radius $r$, which depends on the sensing range of the sensor nodes, the k-partial set cover problem asks whether the $k$ disks can cover at least $p$ nodes. In this paper, we only consider sensor nodes in a plane (the dimension is 2). This kind of k-partial set cover problem is NP-hard.

At present, all known exact algorithms for NP-hard problems require time that is exponential in the problem size, and it is unknown whether faster algorithms exist. Therefore, to solve an NP-hard problem of any nontrivial size, one approach is to use an approximation algorithm, which obtains a solution in polynomial time. The SETCOVER algorithm [23] is a good approximation method for determining the value of $k$ and the locations of these $k$ disks on the plane of interest.

SETCOVER “guesses” the set with the highest cost in the optimal solution by considering each set in turn to be the highest-cost set. For each set $S_j$ chosen as the highest-cost set, $S_j$ along with all the elements it contains is removed from the instance and is included as part of the cover for this guess. The cost of all sets having a higher cost than $c(S_j)$ is raised to $\infty$. $I_j = (T^j, S^j, c', k_j)$ is the modified instance. SETCOVER then calls PRIMALDUAL on $I_j$, which uses a primal-dual approach [24] to return a set cover for $I_j$. In PRIMALDUAL, the dual variables $u_i$ are increased for all $t_i \in T^j$ until there exists a set $S_a$ such that $\sum_{i:\, t_i \in S_a} u_i = c'(S_a)$. Sets are chosen in this way until the cover is feasible. The algorithm then chooses the minimum-cost solution among the $m$ solutions found.


After the value of k and the locations of the k disks are
obtained, our algorithm takes the centers of those k disks as the
optimum locations. Since these k disks cover almost all sensor
nodes in the area, the sensor nodes located at these positions can
monitor almost all information in the area of interest.
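The covering question in this subsection can be illustrated with a simple greedy heuristic. The sketch below is not the SETCOVER/PRIMALDUAL procedure of [23], [24]; it swaps in a plain greedy rule under the assumptions that candidate disk centers are restricted to node positions and that every disk has the same cost. The function name and the sample coordinates are hypothetical.

```python
import math


def greedy_partial_disk_cover(nodes, radius, p):
    """Greedily pick disk centers until at least p nodes are covered.

    nodes  : list of (x, y) positions in the plane
    radius : common disk radius (e.g. the sensing range)
    p      : number of nodes that must be covered
    Assumption: centers are chosen among the node positions themselves.
    """
    p = min(p, len(nodes))                 # cannot cover more nodes than exist
    uncovered = set(range(len(nodes)))
    allowed_uncovered = len(nodes) - p
    centers = []
    while len(uncovered) > allowed_uncovered:
        best_center, best_gain = None, set()
        for candidate in nodes:
            gain = {i for i in uncovered
                    if math.dist(candidate, nodes[i]) <= radius}
            if len(gain) > len(best_gain):
                best_center, best_gain = candidate, gain
        # An uncovered node always covers itself, so best_gain is never empty.
        centers.append(best_center)
        uncovered -= best_gain
    return centers


if __name__ == "__main__":
    sample = [(0.0, 0.0), (0.5, 0.0), (5.0, 5.0), (5.5, 5.0), (10.0, 10.0)]
    print(greedy_partial_disk_cover(sample, radius=1.0, p=4))
```

The number of returned centers plays the role of k, and the centers themselves are the candidate optimum locations. A greedy rule like this carries no guarantee of matching the approximation ratio of the primal-dual method.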

E.  Uncertainty  Acquisition 

Numerous factors introduce uncertainty into the query results, as
mentioned above. In most existing work, the uncertainties of query
results are determined by PDFs of measurement uncertainty, which
are pre-estimated from historical data. If a large amount of
historical data is not available, the performance of those
existing algorithms degrades, or they cannot handle this condition
at all.

As we know, in general SDSs the memory of a sensor node is too
limited to keep a large amount of history information, compared to
many network terminal devices. For instance, the Berkeley motes
have at most 128KB program memory, 4KB RAM, and 512KB external
nonvolatile storage [2].

Inspired  by  this  demand,  we  formulate  the  PDFs  of  mea¬ 
surement  uncertainties  from  other  information  instead  of  his¬ 
tory  data.  Hereafter  we  analyze  the  main  factors  that  introduce 
the  uncertainty  into  the  query  results  in  order  to  formulate 
them.  These  include: 

•  Observation  Coverage 

Observation coverage is the area covered by active sensor
nodes during query processing. Since the physically
observable world consists of a set of continuous phenomena
in space, it is impossible to gather all relevant data
through nodes whose observation coverage is not continuous.
In this case, partial observation coverage introduces
uncertainty into the query results. We define an observation
coverage PDF ($f_c$) to represent the total coverage of all
active sensor nodes for a query.

•  Measurement  Accuracy 

The quality of a sensor node’s sensing component usually
boils down to its measurement stability and measurement
accuracy. In general, as measurement stability and accuracy
increase, so do power requirements and cost, both of which
are problematic for general sensor nodes. In general
applications, sensor nodes of different cost are deployed.
Hence, measurement errors introduce uncertainty into the
query results. For example,
the speed-detection sensor node CXM539 has a built-in
magnetoresistive sensor. The measurement accuracy is 100 pT
(ilmGauss) [22]. We define a measurement accuracy PDF
($f_m$) to represent the measurement error produced by the
related sensor nodes.

We  employ  formula  (10)  to  calculate  the  corresponding 
uncertainty  for  a  query  result. 

$$p = \int_{\phi} f_m(x)\,dx \times \int_{\psi} f_m(y)\,dy \qquad (10)$$
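Formula (10) multiplies two integrals of a PDF over two regions. As a minimal numerical sketch, the code below assumes that each PDF is Gaussian and that each region is a one-dimensional interval. Neither assumption is stated in the text, and the function names and parameter values are hypothetical.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def interval_probability(lo, hi, mean, std):
    """Integral of a Gaussian PDF over the interval [lo, hi]."""
    return normal_cdf(hi, mean, std) - normal_cdf(lo, mean, std)


def query_probability(interval_x, interval_y, pdf_x, pdf_y):
    """Product of two interval integrals, mirroring the shape of (10).

    interval_x, interval_y : (lo, hi) integration limits (assumed intervals)
    pdf_x, pdf_y           : (mean, std) of the assumed Gaussian PDFs
    """
    px = interval_probability(*interval_x, *pdf_x)
    py = interval_probability(*interval_y, *pdf_y)
    return px * py


if __name__ == "__main__":
    # Both factors integrate a standard normal over one standard deviation.
    print(query_probability((-1.0, 1.0), (-1.0, 1.0), (0.0, 1.0), (0.0, 1.0)))
```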

We  present  a  classification  of  probabilistic  queries  and 
examples  of  common  representative  uncertainty  for  each  class. 


The definition of $f_m$ differs for each class.

1)  simple aggregation class

In this class, only the value of a single sensor node is
returned, as in MIN and MAX queries. In this case,

Node j is the node that detects the highest/lowest
value during this query.

2)  complex aggregation class

In this class, a value derived from the data of a group of
sensor nodes is returned, as in an AVG query. We assume
there are M active nodes for this query and that, for each
node j, $f_{m_j}$ follows the same type of distribution
with its own mean ($\mu_j$) and variance ($\sigma_j^2$).
Then $f_m$ has the same type of distribution as $f_{m_j}$,
but, for independent node measurements, with mean
$\frac{1}{M}\sum_{j=1}^{M}\mu_j$ and variance
$\frac{1}{M^2}\sum_{j=1}^{M}\sigma_j^2$. Moreover, if M is
large enough, $f_m$ approaches a normal distribution [25].
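For an AVG query, the moments of the averaged measurement follow from the per-node moments. The sketch below assumes that the node measurement errors are independent, which the text does not state explicitly. The function name and sample values are hypothetical.

```python
def avg_query_moments(means, variances):
    """Mean and variance of the average of M independent node readings.

    means     : per-node means (mu_j)
    variances : per-node variances (sigma_j squared)
    Assumption: readings are independent, so variances add.
    """
    if len(means) != len(variances) or not means:
        raise ValueError("need one mean and one variance per active node")
    m = len(means)
    avg_mean = sum(means) / m
    avg_variance = sum(variances) / (m * m)
    return avg_mean, avg_variance


if __name__ == "__main__":
    # Three hypothetical temperature nodes.
    print(avg_query_moments([30.0, 40.0, 50.0], [1.0, 4.0, 4.0]))
```

With identical variances the result reduces to the familiar rule that averaging M readings divides the variance by M.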

For example, a database user retrieves the highest, lowest
and average temperatures of location 1 and location 2. The
query results are given in Table I. The relational database
model [3] is employed. The TEMPERATURE relation (see Table I)
is used to record the temperature information of the areas of
interest. The values of PROB1, PROB2 and PROB3 are calculated
using (10).

TABLE I

Example of uncertain relation: TEMPERATURE. PROB1 is the
uncertainty associated with the lowest temperature, PROB2 is the
uncertainty associated with the highest temperature, and PROB3 is
the uncertainty associated with the average temperature.

LOCATION     MIN   PROB1   MAX   PROB2   MEAN   PROB3
location 1   30    7.5%    50    5.56%   44.5   10%
location 2   40    4.22%   55    0.02%   48.9   15.89%

IV.  Simulations  and  Performance  Evaluation 

One hundred sensor nodes are deployed randomly in an area of
10 m × 10 m, and the sensing range is 1 m. The initial energy
of the sensor nodes is uniformly distributed within [0, 5] J.
We run 1000 Monte Carlo simulations to average out the
randomness of the results. We compare our QOA against the
original query processing method without any query
optimization.

The energy consumption model for data sensing is given in (11):

$$E_{se} = E_{elec} \cdot S_t \qquad (11)$$

where $E_{se}$ is the energy consumed by processing one query,
$E_{elec}$ is the energy consumed by one data sample, and $S_t$
is the duration of one query, which is defined by the database
users’  query.  In  this  paper,  we  choose:  Eeiec  =  bmp/ sample. 
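The energy model in (11) can be turned into a rough per-node lifetime estimate. The sketch below treats the query duration as a number of samples per query, which is an interpretation rather than something the text states. The function names and numeric values are hypothetical and are not the simulation settings.

```python
def energy_per_query(energy_per_sample_j, samples_per_query):
    """Energy of one query as in (11): per-sample energy times query duration."""
    return energy_per_sample_j * samples_per_query


def queries_until_depleted(initial_energy_j, energy_per_sample_j, samples_per_query):
    """How many whole queries a node can serve before its battery runs out."""
    cost = energy_per_query(energy_per_sample_j, samples_per_query)
    if cost <= 0:
        raise ValueError("a query must consume a positive amount of energy")
    return int(initial_energy_j // cost)


if __name__ == "__main__":
    # Hypothetical node: 5 J battery, 0.125 J per sample, 2 samples per query.
    print(queries_until_depleted(5.0, 0.125, 2))
```

Under this model, a node that sleeps through a query spends no sensing energy on it, which is the mechanism behind the lifetime extension reported in the simulations.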
In Fig. 3, we plot the sampling index versus the node dead
time. Without QOA, all nodes use up their energy after about
20 rounds of sampling. With QOA, the whole network is not
down until 53 rounds of sampling. Therefore, QOA extends the
lifetime of the network by a factor of about 2.5.


Fig.  3.  Nodes  Dead  Time 

In Fig. 4, we compare the observation coverage rate of the
two schemes. Observe that QOA outperforms the original query
processing method, using about 65 − 45 = 20 fewer nodes to
cover 90% of the area of interest.


Fig.  4.  Observation  Coverage  Rate 

In Fig. 5, we plot the node selection rate for QOA. Observe
that at most about 40% of the nodes are chosen for a query,
and at least about 15% of the nodes are active to respond to
a query. This simulation result illustrates why our QOA can
conserve energy: about half of the nodes switch to
energy-saving mode during query processing.

By employing our QOA, energy is saved and the lifetime of the
network is extended. The cost of using our algorithm is a
decrease in the observation coverage rate. In Fig. 6, we plot
the degree of this decrease. The largest decrease in the
observation coverage rate is 16.6%, but most of the time the
decrease is less than 8%.

V.  Conclusions 

In this paper, we propose an energy-efficient query
optimization algorithm for imperfect information in sensor
database systems. We task sensor networks through declarative
queries.


Fig.  5.  Nodes  Selection  Rate 


Fig.  6.  Decrease  of  Observation  Coverage  Rate 


Given  a  query,  our  query  optimization  algorithm  generates  an 
energy  efficient  query  plan  for  in-network  query  processing. 
Moreover,  our  algorithm  explicitly  exposes  uncertainty  and 
ambiguity  of  query  results  to  database  users.  We  formulate  the 
PDFs  of  measurement  uncertainties  according  to  the  knowl¬ 
edge  on  observation  coverage  and  devices  employed,  instead 
of  estimating  them  from  prior  data. 

The simulation results show that our algorithm can
substantially reduce resource usage and thus extend the
lifetime of the sensor database system.
Abstract - In this paper, we are concerned with developing
a  fuzzy  deployment  for  wireless  sensor  networks.  Traditional 
deployments  often  assume  a  homogeneous  environment,  which 
ignores  the  effect  of  terrain  profile  and  obstacles  such  as  build¬ 
ings,  trees  and  so  on.  Nevertheless,  in  many  applications,  some 
areas  need  to  be  more  critically  monitored.  All  these  factors 
are  combined  together  through  Fuzzy  Logic  System  in  our  pro¬ 
posed  scheme.  Simulation  results  show  that  the  Fuzzy  Deploy¬ 
ment  improves  the  worst-case  coverage  by  around  5  dB. 

Keywords  -  Deployment,  fuzzy  logic,  propagation  modeling. 


I.  INTRODUCTION 


In  this  paper,  we  are  concerned  with  developing  a  fuzzy 
deployment  for  wireless  sensor  networks  (WSN).  Traditional 
deployments  often  assume  a  homogeneous  environment  [1], 
which  ignores  the  effect  of  terrain  profile  and  obstacles  such 
as  buildings,  trees  and  so  on.  Such  approaches  have  proved  in¬ 
accurate  in  the  practice  of  cellular  networks.  In  fact,  many 
propagation  models,  based  on  theoretical  calculation  and/or 
empirical  data,  have  been  proposed  to  predict  path  loss  over 
irregular  terrain.  For  instance,  the  Longley-Rice  model  [2,3], 
also  known  as  the  ITS  irregular  terrain  model,  was  proposed 
to  predict  large-scale  median  transmission  loss  relative  to  free 
space  loss  over  irregular  terrain.  The  Longley-Rice  method 
operates  in  two  modes,  namely,  point-to-point  and  area  modes. 
Taking  a  similar  approach,  Durkin  et  al.  [4,5]  proposed  a  com¬ 
puter  simulator  to  predict  field  strength  contours  over  irregular 
terrain,  which  was  adopted  by  U.K.  JRC  for  the  estimation  of 
effective  mobile  radio  coverage  areas.  As  a  standard  for  sys¬ 
tem  planning  in  Japan,  Okumura’s  model  [6]  is  widely  used  for 
signal prediction in urban areas. However, none of these works
has been taken into consideration in current WSN research.

Nevertheless,  in  many  applications,  some  areas  need  to  be 
more  critically  monitored.  For  example,  if  there  is  a  road 
through  the  area  of  interest,  and  chances  are  that  targets  would 
follow  this  road,  then  it  would  be  advisable  to  deploy  more 
sensors  around  this  road.  In  this  paper,  we  utilize  Fuzzy  Logic 


System  to  combine  all  these  factors  together  to  achieve  a  better 
deployment. 

The  rest  of  this  paper  is  organized  as  follows.  Section  II 
introduces  the  preliminaries  that  our  research  is  based  on.  A 
Fuzzy  Deployment  scheme  is  proposed  in  Section  III  and  sim¬ 
ulations  are  given  in  Section  IV.  Section  V  concludes  this  pa¬ 
per. 


II.  PRELIMINARIES 


A.  Overview  of  Fuzzy  Logic  Systems 


Figure 1 shows the structure of a fuzzy logic system
(FLS) [7]. When an input is applied to an FLS, the inference
engine computes the output set corresponding to each rule. The
defuzzifier then computes a crisp output from these rule output
sets. Consider a p-input, 1-output FLS using singleton
fuzzification, center-of-sets defuzzification [8] and “IF-THEN”
rules of the form [9]

$R^l$: IF $x_1$ is $F_1^l$ and $x_2$ is $F_2^l$ and $\cdots$ and $x_p$ is $F_p^l$, THEN $y$ is $G^l$.


Assuming singleton fuzzification, when an input
$x' = \{x_1', \ldots, x_p'\}$ is applied, the degree of firing
corresponding to the lth rule is computed as

$$\mu_{F_1^l}(x_1') * \mu_{F_2^l}(x_2') * \cdots * \mu_{F_p^l}(x_p') = \mathcal{T}_{i=1}^{p}\,\mu_{F_i^l}(x_i') \qquad (1)$$

where $*$ and $\mathcal{T}$ both indicate the chosen t-norm.
There are many kinds of defuzzifiers. In this paper, we focus,
for illustrative purposes, on the center-of-sets defuzzifier [8].
It computes a crisp output for the FLS by first computing the
centroid, $c_{G^l}$, of every consequent set $G^l$, and then
computing a weighted average of these centroids. The weight
corresponding to the lth rule consequent centroid is the degree
of firing associated with the lth rule,
$\mathcal{T}_{i=1}^{p}\mu_{F_i^l}(x_i')$, so that

$$y_{cos}(x') = \frac{\sum_{l=1}^{M} c_{G^l}\,\mathcal{T}_{i=1}^{p}\mu_{F_i^l}(x_i')}{\sum_{l=1}^{M}\mathcal{T}_{i=1}^{p}\mu_{F_i^l}(x_i')} \qquad (2)$$


where  M  is  the  number  of  rules  in  the  FLS. 
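The firing degree in (1) and the center-of-sets output in (2) can be sketched in a few lines. The code below assumes the product t-norm and Gaussian antecedent membership functions. Both are illustrative choices, and the rule values in the example are hypothetical rather than taken from the paper.

```python
import math


def gaussian_mf(x, mean, std):
    """Unnormalized Gaussian membership grade of x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2)


def center_of_sets(x, rules):
    """Crisp output of a singleton-fuzzified FLS with center-of-sets defuzzification.

    x     : sequence of p crisp inputs
    rules : list of (antecedents, centroid), where antecedents is a list of
            p (mean, std) pairs and centroid is the consequent-set centroid
    """
    numerator = 0.0
    denominator = 0.0
    for antecedents, centroid in rules:
        firing = 1.0
        for xi, (mean, std) in zip(x, antecedents):
            firing *= gaussian_mf(xi, mean, std)   # product t-norm, as in (1)
        numerator += centroid * firing
        denominator += firing
    if denominator == 0.0:
        raise ValueError("no rule fired for this input")
    return numerator / denominator                 # weighted average, as in (2)


if __name__ == "__main__":
    # Two hypothetical rules on two inputs normalized to [0, 10].
    rules = [([(0.0, 1.0), (0.0, 1.0)], 1.0),
             ([(10.0, 1.0), (10.0, 1.0)], 9.0)]
    print(center_of_sets((5.0, 5.0), rules))
```

An input halfway between the two rules fires both equally, so the output is the plain average of the two centroids.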


Fig. 1. The structure of a fuzzy logic system.


B.  Coverage 

Grid-based approaches are often used to compute the coverage
provided by sensor networks [10, 11]. However, for resolution
and complexity considerations, Voronoi-based approaches are
required in some situations. Because the Voronoi diagram
partitions the plane into a set of convex polygons such that all
points inside a polygon are closest to only one site, it has been
widely used to determine the best- and worst-case coverage [12].
For illustrative purposes, only worst-case coverage is considered
in this paper. Considering the targets as sources of signals, the
received signal strength can be found by subtracting the overall
path loss from the radiated power plus antenna gains, expressed
in dB. Thus, assuming the propagation is bidirectionally
symmetric, the coverage can be represented by the overall path
loss observed at the vertices of the Voronoi polygons.

C.  Propagation  Model 

In previous work [13], general long-distance path loss models
are often used, which assume that the average large-scale path
loss is expressed as a function of distance by using a path loss
exponent, n [14]:

$$PL(d) = PL(d_0)\left(\frac{d}{d_0}\right)^{n} \qquad (3)$$

where  n  is  the  path  loss  exponent,  which  indicates  the  rate  at 
which  the  path  loss  increases  with  distance,  d0  is  the  close- 
in  reference  distance,  which  is  determined  from  measurement 
close  to  the  transmitter,  and  d  is  the  distance  from  the  source  to 
the  receiving  point.  However,  the  propagation  often  takes  place 
over  irregular  terrain,  and  the  effect  of  terrain  profile  in  many 
cases  is  not  negligible.  Based  on  a  systematic  interpretation 
of  measurement  data  obtained  in  different  areas,  a  number  of 
propagation  models  are  developed  to  predict  signal  strength. 
For  example,  work  by  Walfisch  and  Bertoni  [15]  considers  the 
impact  of  rooftops  and  building  height  by  using  diffraction  to 
predict  average  signal  strength  at  street  level.  Since  the  rows  or 
blocks  of  buildings  are  viewed  as  diffracting  cylinders  lying  on 
the  earth  in  the  development  of  this  model,  it  is  also  applicable 
to obstacles such as trees, shrubs and so on. In this model, the
path loss, S, is a product of three factors, namely $P_0$, $Q^2$
and $P_1$, which are due to free-space path loss, the reduction
in the rooftop signal, and diffraction, respectively. When
expressed in dB, the overall path loss is given by

$$L_p = L_0 + L_{ex} \qquad (4)$$

where $L_0$ is the free-space path loss and $L_{ex}$ is the
excess path loss due to the terrain profile. In this paper, we
only consider area-mode path loss, in which the whole area of
interest is divided into smaller subareas and each subarea has a
different value of $L_{ex}$. Although this is a simple model, it
satisfies our accuracy requirement. A more accurate model could
be used at the cost of complexity.
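The decomposition of the overall path loss into a free-space term and an excess term can be sketched as follows. The free-space term uses the standard Friis expression for isotropic antennas. The carrier frequency and the excess-loss value are hypothetical, since the text does not give them.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def free_space_path_loss_db(distance_m, freq_hz):
    """Free-space path loss in dB between isotropic antennas (Friis)."""
    return 20.0 * math.log10(4.0 * math.pi * distance_m * freq_hz / SPEED_OF_LIGHT)


def overall_path_loss_db(distance_m, freq_hz, excess_db):
    """Overall loss: free-space loss plus the subarea's excess loss, both in dB."""
    return free_space_path_loss_db(distance_m, freq_hz) + excess_db


if __name__ == "__main__":
    # Hypothetical link: 100 m at 2.4 GHz through a subarea with 6 dB excess loss.
    print(round(overall_path_loss_db(100.0, 2.4e9, 6.0), 2))
```

In area mode, each subarea would carry its own excess-loss value while the free-space term depends only on distance and frequency.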


III.  FUZZY  DEPLOYMENT 


A.  Problem  Formulation 

In this work, a simple theoretical propagation model is used.
However, the fuzzy deployment is also applicable to any other
propagation model. First, we assume the area of interest is
divided into square subareas, each of which has its own terrain
profile and required level of surveillance, which can be
translated into an area path loss $PL(i)$ and a required path
loss threshold $PL_{TH}(i)$. Second, it is possible to control
the number of sensor nodes sprayed in each subarea. Third, the
spray is uniformly random in each subarea. The problem is then to
determine the number of sensor nodes needed in the ith subarea,
$n(i)$. Any deployment that assumes a homogeneous environment
would have to deploy the same number of sensors in each subarea,
which most likely cannot meet the requirements. From common
sense, we know that it would be advisable to deploy more sensors
in those areas with larger area path loss and a higher level of
surveillance, though such a relationship is not easy to
determine. However, fuzzy logic systems have demonstrated their
power in utilizing such subjective knowledge. Therefore, we apply
fuzzy logic to this problem to determine the best number of
sensors for each subarea.

B.  Scheme  description 

For the ith subarea, its area path loss $PL(i)$ and required
path loss threshold $PL_{TH}(i)$ are normalized to [0, 10] by
convention. Each antecedent, $PL(i)$ or $PL_{TH}(i)$, has three
overlapping membership functions covering the whole input space,
as shown in Fig. 2. The overlaps between the membership functions
guarantee that more rules are fired for a specific input, so that
the decision is distributed over more rules and robustness is
improved. Such a choice of membership functions provides
$M = 3^2 = 9$ rules.

The rules are designed as follows:

$R^l$: IF $PL(i)$ is low and $PL_{TH}(i)$ is high,

THEN $weight(i)$ is $w^l$.


Fig. 2. Membership functions for T(PL) and T(PL_TH) =
{Low, Medium, High}.


where $w^l$ is an integer in [1, 9]. In our design, a different
$w^l$ is used for each of the M rules. The whole rule base is
shown in Fig. 3. The numbers in the figure are the values of
$w^l$ associated with the respective rules. For example, the
upper-right cell with the number “9” means:

$R^9$: IF $PL(i)$ is high and $PL_{TH}(i)$ is high,

THEN $weight(i)$ is 9.


where $m_A$ and $\sigma_A$ are the mean and standard deviation,
respectively. If we replace the triangular and trapezoidal
membership functions in Fig. 2 by the Gaussian membership
functions in Fig. 4, the number of free parameters is 45, as
listed in Table 2.
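One way to arrive at this count is sketched below. It assumes the common convention that every rule carries its own Gaussian antecedent sets (a mean and a standard deviation per input) plus one consequent value. The convention is an assumption here; only the total of 45, the p = 2 inputs and the M = 9 rules come from the text.

```latex
\underbrace{M \cdot p \cdot 2}_{\text{antecedent parameters}}
+ \underbrace{M}_{\text{consequent parameters}}
= 9 \cdot 2 \cdot 2 + 9 = 36 + 9 = 45
```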


Fig. 4. Gaussian membership functions for T(PL) and T(PL_TH) =
{Low, Medium, High}.


Fig. 3. Rule base for the Fuzzy Deployment.


Table  2.  Number  of  free  parameters  for  Gaussian  membership  functions  in 
Finally, for the ith subarea, the FLS computes an output
$weight(i)$ according to (2). Then $n(i)$ is determined by (6).


$$n(i) = \frac{weight(i)}{\sum_{j} weight(j)} \qquad (6)$$
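Equation (6) gives each subarea a share proportional to its weight. To turn the shares into whole sensor counts, the sketch below multiplies by a total sensor budget and rounds with the largest-remainder rule. Both the multiplication by the budget and the rounding rule are assumptions added for illustration, and the function name is hypothetical.

```python
def allocate_sensors(weights, total_sensors):
    """Split total_sensors across subareas in proportion to their weights.

    weights       : non-negative FLS outputs, one per subarea
    total_sensors : number of sensors available (assumed budget)
    Rounding uses the largest-remainder rule so the counts sum to the budget.
    """
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    shares = [total_sensors * w / total_weight for w in weights]
    counts = [int(s) for s in shares]              # floor of each share
    leftover = total_sensors - sum(counts)
    by_remainder = sorted(range(len(weights)),
                          key=lambda i: shares[i] - counts[i],
                          reverse=True)
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts


if __name__ == "__main__":
    print(allocate_sensors([1, 2, 3, 4], 10))
```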


Although fuzzy logic systems are universal approximators, the
desired dynamics can only be captured with enough free
parameters. Dividing the input space into more overlapping zones
gives finer resolution at the cost of higher complexity; thus,
the number of design parameters is a measure of both resolution
and complexity. Our design has 30 free parameters, as shown in
Table 1.

Table 1. Number of free parameters in the Fuzzy Deployment.

Number of antecedent parameters    21
Number of consequent parameters     9
Total                              30

However, for comparative purposes, the number of free
parameters is often counted when all the membership functions
are chosen to be unnormalized Gaussian functions, i.e.,

$$\mu_A(x) = \exp\left\{-\frac{1}{2}\left(\frac{x - m_A}{\sigma_A}\right)^{2}\right\}, \qquad (5)$$


IV.  SIMULATIONS 


We conducted computer simulations to compare the Fuzzy
Deployment with the traditional uniform deployment.
Specifically, we consider a scenario in which an irregular
terrain profile causes variation in the propagation. A
1 km × 1 km area is divided into 100 square subareas of size
100 m × 100 m, and each subarea has its own specific terrain
profile. Given 1000 sensors to deploy in this area, a traditional
deployment would spray 10 nodes into each subarea, while the
Fuzzy Deployment would adaptively determine the number needed for
each subarea.

We ran the traditional and fuzzy deployments on 200 randomly
generated maps and took the average. The random maps were
generated as follows. For the ith 100 m × 100 m subarea, there
is an area path loss $PL(i)$ and a path loss threshold
$PL_{TH}(i)$, which represent the area terrain profile and
required

Fig.  5.  The  simulation  scenario.  The  parameter  pairs  (PL,  PLth)  are 
labeled  in  each  subarea. 


V.  CONCLUSION  AND  FUTURE  WORK 


In  this  paper,  a  new  fuzzy  deployment  is  presented  and  com¬ 
pared  to  traditional  ones  that  assume  homogeneous  environ¬ 
ment.  Although  the  simulation  results  show  a  5.71  dB  im¬ 
provement  in  the  worst-case  coverage,  the  power  of  the  fuzzy 
deployment  is  not  thoroughly  exploited.  As  shown  in  previ¬ 
ous  fuzzy  applications,  proper  training  often  betters  the  per¬ 
formance  of  fuzzy  logic  systems.  Generally,  a  back  propaga¬ 
tion  training,  also  referred  to  as  a  steepest  descent  algorithm, 
can  be  used  to  tune  all  the  free  parameters  enclosed  in  a  fuzzy 
logic  system.  However,  in  this  case  of  the  Fuzzy  Deployment, 
since  the  explicit  dependence  of  the  worst-case  coverage  on 
the  area  path  loss  and  the  required  threshold  is  unknown,  the 
gradient  could  be  elusive  to  our  knowledge.  In  this  case,  a  ran¬ 
dom  search  algorithm  could  be  used  to  optimize  the  coverage 
provided  by  the  Fuzzy  Deployment. 


level of surveillance, respectively. As shown in Fig. 5, $PL(i)$
and $PL_{TH}(i)$ are uniformly distributed in [4 dB, 10 dB] and
[70 dB, 85 dB], respectively. The worst-case coverage was
determined by the path loss observed at the vertices of the
Voronoi diagrams, and the path loss was calculated according to
Section II.C. Note that the path loss is defined as the
difference between the effective transmitted power and the
received power and thus takes on positive values; a lower path
loss indicates better coverage in our experiments.

Table 3 shows that the Fuzzy Deployment achieves a 5.71 dB
improvement in the worst-case coverage. Based on the simulations,
we conclude that using fuzzy logic to find the optimal number of
nodes in each subarea is beneficial because the path loss
observed at the worst points is reduced, which typically means
better coverage for the whole network. It is also worth pointing
out that these results are not the best performance achievable by
the Fuzzy Deployment, because all the rules used in the Fuzzy
Deployment are so far extracted from linguistic information. For
example, the assignment of the consequent parameters, i.e., the
$w^l$'s, is quite arbitrary and intuitive, which means the
choices in Fig. 3 are not necessarily the best. It is also
possible to extract rules from representative data collected from
real scenarios; such training often improves the performance of a
fuzzy logic system by orders of magnitude.
Abstract — Previous research shows that restraining cluster
size helps energy efficiency in sensor networks. However, it is
often ignored that distance estimation in sensor networks is too
inaccurate for fine-grained clustering decisions. In
this  paper,  we  are  concerned  with  developing  a  fuzzy  cluster 
size  to  handle  the  distance  error  and  non-linearity.  A  fuzzy 
logic  system  is  developed  to  make  clustering  decision  based  on 
the  received  signal  strength.  Simulation  results  show  that  the 
proposed  Fuzzy  Cluster  Size  scheme  can  keep  the  performance 
near  the  optimal  range  when  distance  estimation  is  distorted  by 
log-normal  shadowing. 

I.  Introduction 

Recent  technological  advances  have  made  it  possible  to 
develop  distributed  sensor  networks  consisting  of  a  large 
number  of  low-cost,  low-power,  and  multi-functional  sensor 
nodes  that  communicate  in  short  distances  through  wireless 
links [1]. Example applications of sensor networks include
target  tracking,  scientific  exploration,  and  data  acquisition  in 
hazardous  environments. 

Wireless sensors provide a clear advantage in cost, size,
flexibility and distributed intelligence over their wired
counterparts. However, the energy constraint remains a major
concern for wireless sensor networks [2]-[4]. Clustering has
been proposed and heavily studied as a way to improve energy
efficiency [5], [6]. It is shown in [7] that clustering should be
done with a cluster size constraint; however, this constraint
comes in the form of a fixed cluster radius, which could be
insufficient to model the complexity of clustering in sensor
networks. Furthermore, distance estimation in sensor networks is
too inaccurate for fine-grained clustering decisions. In this
paper, we are concerned with developing a fuzzy cluster size to
handle the distance error and non-linearity in the clustering.

The  rest  of  this  paper  is  organized  as  follows.  Section  II 
introduces  the  preliminaries  that  our  research  is  based  on.  The 
problem  with  fixed  cluster  size  is  discussed  in  Section  III. 
A  Fuzzy  Logic  System  is  developed  to  make  the  clustering 
decision  in  Section  IV  and  simulations  are  given  in  Section 
V.  Section  VI  concludes  this  paper. 

II.  Preliminaries 
A.  Overview  of  Fuzzy  Logic  Systems 

Figure 1 shows the structure of a fuzzy logic system
(FLS) [8]. When an input is applied to an FLS, the inference
engine computes the output set corresponding to each rule.
The defuzzifier then computes a crisp output from these rule
output sets. Consider a p-input, 1-output FLS using singleton
fuzzification, center-of-sets defuzzification [9] and “IF-THEN”
rules of the form [10]

$R^l$: IF $x_1$ is $F_1^l$ and $x_2$ is $F_2^l$ and $\cdots$ and $x_p$ is $F_p^l$, THEN $y$ is $G^l$.


Assuming singleton fuzzification, when an input
$x' = \{x_1', \ldots, x_p'\}$ is applied, the degree of firing
corresponding to the lth rule is computed as

$$\mu_{F_1^l}(x_1') * \mu_{F_2^l}(x_2') * \cdots * \mu_{F_p^l}(x_p') = \mathcal{T}_{i=1}^{p}\,\mu_{F_i^l}(x_i') \qquad (1)$$


where $*$ and $\mathcal{T}$ both indicate the chosen t-norm.
There are many kinds of defuzzifiers. In this paper, we focus,
for illustrative purposes, on the center-of-sets defuzzifier [9].
It computes a crisp output for the FLS by first computing the
centroid, $c_{G^l}$, of every consequent set $G^l$, and then
computing a weighted average of these centroids. The weight
corresponding to the lth rule consequent centroid is the degree
of firing associated with the lth rule,
$\mathcal{T}_{i=1}^{p}\mu_{F_i^l}(x_i')$, so that

$$y_{cos}(x') = \frac{\sum_{l=1}^{M} c_{G^l}\,\mathcal{T}_{i=1}^{p}\mu_{F_i^l}(x_i')}{\sum_{l=1}^{M}\mathcal{T}_{i=1}^{p}\mu_{F_i^l}(x_i')} \qquad (2)$$


where  M  is  the  number  of  rules  in  the  FLS. 


Fig. 1. The structure of a fuzzy logic system.


B.  Ranging  Techniques 

Distance is often estimated based on received signal
strength, time of arrival (TOA), time difference of arrival
(TDOA) or angle of arrival [11]. Angle-of-arrival-based ranging
requires directive antennas or arrays, which are not suitable
for most microsensors. Similarly, measuring time of flight
requires a timing device with sufficient resolution, as in GPS.
Although TDOA needs much less resolution, it often requires
extra acoustic or ultrasound emission, which comes with a higher
price, larger size and more energy consumption, all of which
seem impractical for microsensors. Thus, the most technically
feasible ranging is based on received signal strength; in fact,
RSSI (Received Signal Strength Indication) is widely used in
wireless communications to provide distance estimation. The
underlying observation is that the average large-scale path loss
can be expressed as a function of distance by using a path loss
exponent, n [12]:

$$\overline{PL}(d) = \overline{PL}(d_0)\left(\frac{d}{d_0}\right)^{n} \qquad (3)$$

where n is the path loss exponent, which indicates the rate at
which the path loss increases with distance, $d_0$ is the
close-in reference distance, which is determined from
measurements close to the transmitter, and d is the distance from
the source to the receiving point. Measurements have also shown
that at any value of d, the path loss $PL(d)$ at a particular
location is random and distributed log-normally (normally in dB)
about the mean distance-dependent value:

$$PL(d)\,[\mathrm{dB}] = \overline{PL}(d)\,[\mathrm{dB}] + X_\sigma, \qquad (4)$$

where $X_\sigma$ is a zero-mean Gaussian distributed random
variable (in dB) with standard deviation $\sigma$ (also in dB).
Log-normal shadowing is the main source of distance error for
received-signal-strength-based ranging methods. The values of n
and $\sigma$ are often estimated empirically; for example, n
could vary from 2 to 10 for different environments, and a typical
value of $\sigma$ in urban areas is around 10 dB.
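The effect of log-normal shadowing on signal-strength ranging can be sketched by inverting the mean path loss model. The code below writes (3) in dB, adds a Gaussian shadowing term as in (4), and solves for distance. The reference loss, exponent and distances used in the example are hypothetical values.

```python
import math
import random


def mean_path_loss_db(d, d0, pl_d0_db, n):
    """Mean path loss in dB at distance d, i.e. (3) expressed in dB."""
    return pl_d0_db + 10.0 * n * math.log10(d / d0)


def estimate_distance(measured_pl_db, d0, pl_d0_db, n):
    """Distance obtained by inverting the mean path loss model."""
    return d0 * 10.0 ** ((measured_pl_db - pl_d0_db) / (10.0 * n))


def simulate_ranging(d, d0, pl_d0_db, n, sigma_db, rng):
    """One ranging attempt with zero-mean Gaussian shadowing (in dB), as in (4)."""
    shadowing = rng.gauss(0.0, sigma_db)
    measured = mean_path_loss_db(d, d0, pl_d0_db, n) + shadowing
    return estimate_distance(measured, d0, pl_d0_db, n)


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical link: true distance 50 m, 40 dB loss at 1 m, n = 3, sigma = 10 dB.
    print([round(simulate_ranging(50.0, 1.0, 40.0, 3.0, 10.0, rng), 1)
           for _ in range(5)])
```

A shadowing error of X dB scales the distance estimate by a factor of 10^(X / (10 n)), so the error is multiplicative and grows with the true distance.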


Considering  the  phenomenon  of  interest  as  a  random  pro¬ 
cess,  the  correlation  between  data  collected  by  two  sensors 
is  generally  a  decreasing  function  of  the  distance  r  between 
them.  After  the  data  aggregation  removes  most  of  the  redun¬ 
dancy,  the  residue  can  be  assumed  an  increasing  function  of 
r.  Based  on  the  above  observation,  the  data  aggregation  effect 
is  modeled  as  below. 

Suppose there are $M_k$ non-head members in cluster k (k = 1, 2, 3, ..., c). The ith member (i = 1, 2, 3, ..., $M_k$) collects l bits and sends them back to its head k at distance $r_{ki}$. The head expends $2lE_{DA}$ Joules on the data aggregation of the 2l bits (l bits collected by itself and another l bits by its ith member), where $E_{DA}$ is set as 5 nJ/bit as in [5] and listed in Table I. The resulting data is assumed to be of $l(1 + \eta_{ki})$ bits, where $\eta_{ki}$ is the data aggregation residue ratio and is assumed to be complementary exponential, specifically,

$$\eta_{ki} = 1 - e^{-\alpha r_{ki}}, \quad 0 < \alpha < 1, \tag{7}$$

where $\alpha$ is a small positive real number whose magnitude depends on the specific phenomenon of interest. For example, light, acoustic, seismic and thermal signals often show a strong correlation at short distances, and thus $\alpha$ will have smaller values for such data. Since $\eta$ is a monotonically increasing function of $\alpha$ and r, $\eta$ approaches zero for smaller $\alpha$ and r. This model can approach the perfect-data-correlation assumption in [5] by decreasing $\alpha$, or approach the no-data-aggregation assumption in [15], [16] by increasing $\alpha$; thus, different scenarios can easily be set up by varying $\alpha$.
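The behavior of the residue model can be checked numerically. The sketch below evaluates the complementary exponential ratio of (7) and the resulting aggregated size for a few values of $\alpha$; the function names and the sample packet size of 1000 bits are illustrative assumptions.

```python
import math

def residue_ratio(r, alpha):
    """Data aggregation residue ratio eta = 1 - exp(-alpha * r), as in (7)."""
    return 1.0 - math.exp(-alpha * r)

def aggregated_bits(l, r, alpha):
    """Bits left after the head aggregates its own l bits with l bits from a member at distance r."""
    return l * (1.0 + residue_ratio(r, alpha))

# Small alpha approaches perfect correlation (about l bits remain);
# large alpha approaches no aggregation (about 2l bits remain).
for alpha in (0.0001, 0.001, 0.1):
    print(alpha, [round(aggregated_bits(1000, r, alpha)) for r in (10, 50, 100)])
```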


C.  Radio  Energy  Consumption 

The  following  model  is  adopted  from  [5]  where  perfect 
power  control  is  assumed.  To  transmit  l  bits  over  distance 
d,  the  sender’s  radio  expends 


$$E_{Tx}(l, d) = \begin{cases} lE_{elec} + l\epsilon_{fs}d^2, & d < d_0 \\ lE_{elec} + l\epsilon_{mp}d^4, & d \ge d_0 \end{cases} \tag{5}$$

and the receiver's radio expends

$$E_{Rx}(l, d) = lE_{elec}. \tag{6}$$


$E_{elec}$ is the unit energy consumed by the electronics to process one bit of message, $\epsilon_{fs}$ and $\epsilon_{mp}$ are the amplifier factors for the free-space and multi-path models, respectively, and $d_0$ is the reference distance that determines which model to use. The values of these communication energy parameters are set as in Table I.
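The radio model can be written as a short sketch. The constants below are the Table I values as read from the source (the free-space factor is taken to be 10 pJ/bit/m^2, which is an assumption where the scanned table is unclear), and the function names are hypothetical.

```python
E_ELEC = 50e-9        # J/bit, electronics energy (Table I)
EPS_FS = 10e-12       # J/bit/m^2, free-space amplifier factor (assumed reading of Table I)
EPS_MP = 0.0013e-12   # J/bit/m^4, multi-path amplifier factor (Table I)
D0 = 86.2             # m, reference distance separating the two models (Table I)

def tx_energy(l, d):
    """Energy (J) to transmit l bits over distance d: free-space below D0, multi-path otherwise."""
    if d < D0:
        return l * E_ELEC + l * EPS_FS * d ** 2
    return l * E_ELEC + l * EPS_MP * d ** 4

def rx_energy(l):
    """Energy (J) to receive l bits; independent of distance."""
    return l * E_ELEC

print(tx_energy(1000, 10.0), tx_energy(1000, 100.0), rx_energy(1000))
```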


D.  Data  Correlation  Model 

The data collected by neighboring sensors have a lot of redundancy; thus, [5] assumes perfect data correlation, in which all individual signals from members of the same cluster can be combined into a single representative signal. Nevertheless, this assumption cannot hold when the cluster size increases beyond some extent. Therefore, we develop a complementary exponential data correlation model based on the observations in distributed data compression [13], [14].


III.  Fixed  Cluster  Size 

Expellant  Self-Organization  (ESO)  was  proposed  to  replace 
the  problematic  random  election  in  LEACH.  ESO  used  an 
individual  clustering  criterion  (8)  to  distribute  the  clustering 
decision  to  each  sensor  node.  That  is 

$$J_{CM}(i) \underset{CM}{\overset{CH}{\gtrless}} J_{CH}(i), \tag{8}$$

where $J_{CM}(i)$ (and $J_{CH}(i)$) is the energy cost if the ith node chooses to be a cluster member (and cluster head),
respectively.  If  we  substitute  the  data  correlation  and  energy 
consumption  model  into  this  criterion,  we  obtain 

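$$E_{elec} + \epsilon_{fs} r_{ik}^2 + E_{elec} + E_{DA} + \eta(r_{ik})\left(E_{elec} + \epsilon_{mp} d_{ch}^4\right) \underset{CM}{\overset{CH}{\gtrless}} E_{DA} + E_{elec} + \epsilon_{mp} d^4. \tag{9}$$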

The non-linearity in (9) makes it difficult to evaluate in real applications; thus, an easier evaluation is needed. Noting that r is the dominating factor in this comparison, (9) reduces to

$$\epsilon_{fs} r_{ik}^2 + \eta(r_{ik})\left(E_{elec} + \epsilon_{mp} d_{ch}^4\right) \underset{CM}{\overset{CH}{\gtrless}} \epsilon_{mp} d^4 - E_{elec}.$$

Suppose there exists a solution for r, which is denoted by $R_c(d, d_{ch})$. Then,

$$r \underset{CM}{\overset{CH}{\gtrless}} R_c(d, d_{ch}). \tag{10}$$

$R_c(d, d_{ch})$ can be determined analytically or empirically. In [7], $R_c(d, d_{ch})$ is simplified into a constant $R_c$ in order to fit in the limited computational capacity of sensors. FLS is especially useful here because it can do non-linear mapping using only linear computations.
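As one way to determine $R_c(d, d_{ch})$ empirically, the sketch below solves the reduced criterion by bisection, treating the member-side cost as $\epsilon_{fs} r^2 + \eta(r)(E_{elec} + \epsilon_{mp} d_{ch}^4)$ and the head-side cost as $\epsilon_{mp} d^4 - E_{elec}$. This is an illustrative numerical procedure under those assumptions, not the method used in [7]; the constants are the Table I values as read, and all function names are hypothetical.

```python
import math

E_ELEC = 50e-9       # J/bit (Table I)
EPS_FS = 10e-12      # J/bit/m^2 (assumed reading of Table I)
EPS_MP = 0.0013e-12  # J/bit/m^4 (Table I)

def member_side(r, d_ch, alpha):
    """Per-bit cost of joining a head at distance r whose distance to the base station is d_ch."""
    eta = 1.0 - math.exp(-alpha * r)
    return EPS_FS * r ** 2 + eta * (E_ELEC + EPS_MP * d_ch ** 4)

def head_side(d):
    """Per-bit cost threshold for a node at distance d from the base station."""
    return EPS_MP * d ** 4 - E_ELEC

def critical_radius(d, d_ch, alpha, r_max=1000.0, tol=1e-6):
    """Bisection for the r at which both sides are equal; None if no root lies in [0, r_max]."""
    target = head_side(d)
    if target < 0.0 or member_side(r_max, d_ch, alpha) < target:
        return None
    lo, hi = 0.0, r_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # member_side increases with r, so the root lies above mid when it is still too cheap
        if member_side(mid, d_ch, alpha) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rc = critical_radius(d=100.0, d_ch=100.0, alpha=0.001)
print(rc)
```

Because the member-side cost grows monotonically with r, a node farther than the returned radius from the candidate head is better off becoming a head itself, which matches the direction of (10).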

The above derivation indicates that the clustering decision is mainly based on distance, and the distance is often estimated with error. In this paper, we consider the ranging error of received-signal-strength-based methods because they are the most widely used. The log-normal shadowing could distort the clustering decision dramatically if it is not accounted for. In the next section, we design a Fuzzy Logic System to address this problem.

IV.  Fuzzy  Cluster  Size 

Since the self-organization details are described in [7], we concentrate on the clustering decision making. Consider a node [i] which is making its clustering decision, i.e., to be a cluster member or a cluster head. Its RSSI meter can give it an RSSI reading from the base station, RSSI_d, and from the neighboring cluster head [k], RSSI_r. The cluster head [k] also has its RSSI reading from the base station, RSSI_d_ch, which is available to node [i] through local broadcast. Based on these three parameters, the FLS outputs a "support" from this node to the neighboring cluster head, which indicates the degree to which this node should join the neighboring cluster head as a cluster member. If the support is above the threshold zero, then this node should join the neighboring cluster head; otherwise, it should claim itself as a cluster head. The whole process is depicted in Fig. 2.


Fig.  2.  System  diagram. 


The rules are designed such as:

$R^l$: IF RSSI_r is Low and RSSI_d is High and RSSI_d_ch is Medium, THEN support is $w^l$,

where $w^l$ is a real number.

Three  Gaussian  membership  functions  are  used  for  each 
antecedent,  and  a  constant  wl  is  assigned  to  each  rule.  The 
Gaussian  membership  function  is  given  by 

$$\mu_A(x) = \exp\left(-\frac{(x - m_A)^2}{2\sigma_A^2}\right), \tag{11}$$

where $m_A$ and $\sigma_A$ are the mean and standard deviation, respectively. Since there are two free parameters for each Gaussian membership function and there are $M = 3^3 = 27$ rules, there are 2 × 3 × 3 = 18 antecedent parameters and 27 consequent parameters, for a total of 45 parameters. These parameters need to be tuned using a set of training data.
Another set of data, called checking data, is often used in training to prevent overfitting, which can be observed when the checking error begins to increase while the training error is still decreasing. The antecedent parameters ($m_A$'s and $\sigma_A$'s) are tuned using back-propagation, while the consequent parameters ($w^l$'s) are determined with the least-squares method [17].
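A minimal sketch of how such a system could map the three RSSI readings to a support value is given below. It assumes a product t-norm for the rule firing degree and a firing-weighted average of the constant consequents, in the style of [17]; the source does not spell out these choices, and the membership parameters and weights shown are placeholders rather than trained values.

```python
import itertools
import math

def gaussian_mf(x, mean, sigma):
    """Gaussian membership function with the given mean and standard deviation."""
    return math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2))

def fls_support(inputs, mfs, weights):
    """inputs: 3 readings; mfs[i]: three (mean, sigma) pairs for input i; weights: 27 consequents."""
    num = den = 0.0
    rules = itertools.product(range(3), repeat=3)  # one rule per combination of labels
    for w, rule in zip(weights, rules):
        firing = 1.0
        for x, input_mfs, label in zip(inputs, mfs, rule):
            mean, sigma = input_mfs[label]
            firing *= gaussian_mf(x, mean, sigma)  # product t-norm (assumed)
        num += w * firing
        den += firing
    return num / den

# Placeholder parameters: equally spaced MFs on a hypothetical 0..100 RSSI scale.
mfs = [[(0.0, 20.0), (50.0, 20.0), (100.0, 20.0)] for _ in range(3)]
weights = [float(k) for k in range(27)]
n_params = 2 * 3 * 3 + len(weights)  # 18 antecedent + 27 consequent parameters
print(n_params, fls_support((30.0, 60.0, 70.0), mfs, weights))
```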

The training and checking data are collected by evaluating (9) at different step sizes. The desired support is defined as the difference between $J_{CH}$ and $J_{CM}$, so that the support is positive when $J_{CM} < J_{CH}$, i.e., when (8) favors joining as a cluster member. The initial membership functions are equally spaced on the input space; for example, the initial membership functions for the first input, RSSI_r, are depicted in Fig. 3. The consequent weights $w^l$ are initialized randomly. The output surface of the trained FLS is plotted in Fig. 4.


Fig. 3. Example: initial membership functions for RSSI_r.

Fig. 4. Output surface of the trained FLS. RSSI_d_ch fixed at 70.

The support is used in two ways. Firstly, when a node is to make its clustering decision, that is, to be a cluster head or a cluster member, it should count the support from its potential supporters, i.e., those nodes that would support this node with positive support if this node chose to be a cluster head. The sum of its support and its energy level are the basis of this node's clustering decision. Secondly, after some node successfully becomes a cluster head, the other nodes should consider joining this cluster head. They would do so only when they find themselves supporting this cluster head with positive support. Those with negative support to all existing cluster heads should consider themselves "unclustered", and try to organize themselves into clusters through ESO.

V.  Simulations 

We compared the performance of clustering with fuzzy and fixed cluster size using computer simulations. 100 nodes with 2 J of initial energy were evenly distributed in a circular region with a diameter of 100 m, and the base station was located at (125 m, 0). The standard deviation of the log-normal shadowing is set at 11.8 dB. The exponent $\alpha$ of the data correlation model is set at 0.001. The communication energy parameters are set as in Table I. We ran simulations over 1000 random network topologies and took the average of the collected data.

TABLE I
Communication Energy Parameters

Name              Value
$d_0$             86.2 m
$E_{elec}$        50 nJ/bit
$E_{DA}$          5 nJ/bit
$\epsilon_{fs}$   10 pJ/bit/m^2
$\epsilon_{mp}$   0.0013 pJ/bit/m^4

In Fig. 5(a), (b), (c) and (d), the amount of data received at the base station over time, the amount of data received at the base station per given amount of energy, the number of surviving nodes over time and the number of surviving nodes per amount of data received at the base station are plotted, respectively, for $R_c$ = 10, 30, 40, 80 (m). Fig. 5(a) and (c) show that the network lifetime is maximized at $R_c$ = 40 m; (a) and (c) also show that the amount of data delivered is maximized at $R_c$ = 40 m. The data/energy ratio, indicated by the slope in (b), also reaches its maximum at $R_c$ = 40 m. However, if $R_c$ is not set at this optimal value, the performance degrades dramatically.

The performance of Fuzzy Cluster Size is plotted in Fig. 6 and compared with fixed $R_c$ = 40 m. The two curves in all four subfigures are very close to each other, which shows that the fuzzy cluster size keeps the performance close to that of the optimal fixed size. When confronted with distance error, this feature makes the results robust.

VI.  Conclusion 

In this paper, we propose using fuzzy cluster size to address the non-linearity and distance uncertainty in clustering. Thanks to the Fuzzy Logic System's power in handling non-linearity and uncertainty, the Fuzzy Cluster Size scheme keeps the clustering performance near the optimal range when the distance estimation is distorted by log-normal shadowing.

Acknowledgment 

This work was supported by the U.S. Office of Naval Research (ONR) Young Investigator Award under Grant N00014-03-1-0466.


References 

[1] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, "A survey on sensor networks," IEEE Commun. Mag., vol. 20, pp. 102-114, Aug. 2002.

[2] A. Ephremides, "Energy concerns in wireless networks," IEEE Wireless Communications, vol. 9, no. 4, pp. 48-59, Aug. 2002.

[3] V. Raghunathan, C. Schurgers, S. Park, and M. Srivastava, "Energy-aware wireless microsensor networks," IEEE Signal Processing Magazine, vol. 19, no. 2, pp. 40-50, March 2002.

[4] R. Min, M. Bhardwaj, S.-H. Cho, N. Ickes, E. Shih, A. Sinha, A. Wang, and A. Chandrakasan, "Energy-centric enabling technologies for wireless sensor networks," IEEE Wireless Communications, vol. 9, no. 4, pp. 28-39, Aug. 2002.

[5] W. B. Heinzelman, A. P. Chandrakasan, and H. Balakrishnan, "An application-specific protocol architecture for wireless microsensor networks," IEEE Trans. Wireless Commun., vol. 1, no. 4, pp. 660-670, Oct. 2002.

[6] J.-S. Liu and C.-H. Lin, "Power-efficiency clustering method with power-limit constraint for sensor networks," in Proc. of the 2003 IEEE International Conference on Performance, Computing, and Communications, Apr. 2003, pp. 129-136.

[7] L. Zhao, X. Hong, and Q. Liang, "Energy-efficient self-organization for wireless sensor networks: A fully distributed approach," in IEEE Globecom '04. Dallas, TX: IEEE, Dec. 2004.

[8] J. M. Mendel, "Fuzzy logic systems for engineering: A tutorial," in Proceedings of the IEEE, vol. 83, no. 3, March 1995, pp. 345-377.

[9] J. M. Mendel, Uncertain Rule-Based Fuzzy Logic Systems. Upper Saddle River, NJ: Prentice-Hall, 2001.

[10] E. H. Mamdani, "Applications of fuzzy logic to approximate reasoning using linguistic systems," IEEE Trans. Syst., Man, Cybern., vol. 26, no. 12, pp. 1182-1191, 1977.

[11] J. Hightower and G. Borriello, "Location sensing techniques," University of Washington, Department of Computer Science and Engineering, Seattle, WA, UW CSE 01-07-01, July 2001.

[12] T. S. Rappaport, Wireless Communications: Principles and Practice. Upper Saddle River, NJ: Prentice-Hall, 2002.

[13] S. Pradhan, J. Kusuma, and K. Ramchandran, "Distributed compression in a dense microsensor network," IEEE Signal Processing Mag., vol. 19, no. 2, pp. 51-60, Mar. 2002.

[14] A. Boulis, S. Ganeriwal, and M. Srivastava, "Aggregation in sensor networks: an energy-accuracy trade-off," in Proceedings of the First IEEE International Workshop on Sensor Network Protocols and Applications, May 2003, pp. 128-138.

[15] T. Shepard, "A channel access scheme for large dense packet radio networks," in Proc. ACM SIGCOMM, Stanford, CA, Aug. 1996, pp. 219-230.

[16] M. Ettus, "System capacity, latency and power consumption in multihop-routed SS-CDMA wireless networks," in Proc. Radio and Wireless Conf. (RAWCON'98), Colorado Springs, CO, Aug. 1998, pp. 55-58.

[17] J.-S. R. Jang, "ANFIS: Adaptive-network-based fuzzy inference systems," IEEE Transactions on Systems, Man, and Cybernetics, vol. 23, no. 3, pp. 665-685, May 1993.


Fig. 6. Performance of clustering at different fixed $R_c$. (a) Amount of data received at the base station over time. (b) Amount of data received at the base station per given amount of energy. (c) Number of surviving nodes over time. (d) Number of surviving nodes per amount of data received at the base station.
Abstract - In this paper, we introduce a new method for cross-layer design in mobile ad hoc networks. We use a fuzzy logic system (FLS) to coordinate the physical layer, data-link layer and application layer for cross-layer design. Ground speed, average delay and packets successful transmission ratio are selected as antecedents for the FLS. The output of the FLS provides adjusting factors for the AMC (Adaptive Modulation and Coding), transmission power, retransmission times and rate control decision. Simulation results show that our cross-layer design can reduce the average delay, increase the throughput and extend the network lifetime. The network performance parameters also remain stable after the cross-layer optimization.

I.  Introduction 

The demand for energy efficiency and Quality of Service (QoS) in mobile ad hoc networks is growing rapidly. To enhance the energy efficiency and QoS, we consider the physical layer, data-link layer and application layer together, in a cross-layer approach. A strict layered design is not flexible enough to cope with the dynamics of mobile ad hoc networks [1]. Cross-layer design can exploit the layer interdependencies to optimize overall network performance. The general methodology of cross-layer design is to maintain the layered architecture, capture the important information that influences other layers, exchange the information between layers and implement adaptive protocols and algorithms at each layer to optimize the performance.

Much previous work has focused on cross-layer design for QoS provision. Liu [2] combines AMC at the physical layer and ARQ at the data-link layer. Ahn [3] uses information from the MAC layer to do rate control at the network layer for supporting real-time and best-effort traffic. Akan [4] proposes a new adaptive transport layer suite, including an adaptive transport protocol and an adaptive rate control protocol based on the lower layer information.

Some work related to energy efficiency has been reported. Bambos proposes power-controlled multiple access schemes in [5]. This protocol reveals the trade-off between the transmitter power cost and the backlog/delay cost in power control schemes. Zhu [6] proposes a minimum energy routing scheme, which considers the energy consumption for data packets as well as control packets of routing and multiple access. In [7], Sichitiu proposes a cross-layer scheduling method. Through combining the network layer and MAC layer, a deterministic, schedule-based energy conservation scheme is proposed. This scheme derives its power efficiency from eliminating idle listening and collisions.

However, cross-layer design can produce unintended interactions among protocols, such as adaptation loops. It is hard to characterize the interaction at different layers, and joint optimization across layers may lead to complex algorithms.

Our algorithm is quite different from all the previous works. We propose to use the Fuzzy Logic System (FLS) in the cross-layer design. We define a coherent time, a certain period of time. During this coherent time, the AMC (Adaptive Modulation and Coding), transmission power, retransmission times and rate control decision are used for packet transmission. After this time, we adaptively adjust these parameters by the FLS again, based on the current ground speed, average delay and packets successful transmission ratio. By applying the FLS mechanism to the cross-layer design, better QoS provision and energy efficiency are achieved.

The remainder of this paper is structured as follows. In section II, we introduce the preliminaries. In section III, we give an overview of fuzzy logic systems. In section IV, we apply the FLS to the cross-layer design. Simulation results and discussions are presented in section V. In section VI, we conclude the paper.

II.  Preliminaries 
A. IEEE 802.11a OFDM PHY

The physical layer is the interface between the wireless medium and the MAC [8]. The principle of OFDM is to divide a high-speed binary signal to be transmitted over a number of low data-rate subcarriers. A key feature of the IEEE 802.11a PHY is to provide 8 PHY modes with different modulation schemes and coding rates, making the idea of link adaptation feasible and important, as listed in Table I. BPSK, QPSK, 16-QAM and 64-QAM are the supported modulation schemes. OFDM provides data transmission rates from 6 to 54 Mbps. The higher code rates of 2/3 and 3/4 are obtained by puncturing the original rate 1/2 code.

TABLE I
Eight PHY Modes of the IEEE 802.11a PHY

Mode  Modulation  Code Rate  Data Rate  BpS (bytes per OFDM symbol)
1     BPSK        1/2        6 Mbps     3
2     BPSK        3/4        9 Mbps     4.5
3     QPSK        1/2        12 Mbps    6
4     QPSK        3/4        18 Mbps    9
5     16-QAM      1/2        24 Mbps    12
6     16-QAM      3/4        36 Mbps    18
7     64-QAM      2/3        48 Mbps    24
8     64-QAM      3/4        54 Mbps    27

B.  IEEE  802.11  MAC 

The 802.11 MAC uses Carrier-Sense Multiple Access with Collision Avoidance (CSMA/CA) to achieve automatic medium sharing between compatible stations. In CSMA/CA, a station senses the wireless medium to determine if it is idle before it starts transmission. If the medium appears to be idle, the transmission may proceed; otherwise, the station will wait until the end of the in-progress transmission. A station will ensure that the medium has been idle for the specified inter-frame interval before attempting to transmit.

Besides carrier sense and the RTS/CTS mechanism, an acknowledgment (ACK) frame will be sent by the receiver upon successful reception of a data frame. Only after receiving an ACK frame correctly does the transmitter assume successful delivery of the corresponding data frame. The sequence for a data transmission is: RTS-CTS-DATA-ACK.

A mobile node will retransmit the data packet when it detects a failed transmission. Retransmission of a single packet can achieve a certain probability of delivery. There is a relationship between the probability of delivery p and the retransmission times n:

$$n = 1.45 \ln\frac{1}{1 - p}. \tag{1}$$

The IEEE 802.11 standard requires that the transmitter's MAC discard a data frame after a certain number of unsuccessful transmission attempts. According to the required probability of delivery, we choose the minimum number of retransmissions. The advantage is that we can save energy by avoiding unnecessary retransmissions while still ensuring the probability of delivery.
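As a small worked example of (1), the sketch below computes the smallest integer retransmission count that meets a required delivery probability. Rounding up to the next integer is an assumption made here, since (1) generally yields a non-integer value; the function name is hypothetical.

```python
import math

def min_retransmissions(p_delivery):
    """Smallest integer n with n >= 1.45 * ln(1 / (1 - p)), following (1)."""
    if not 0.0 <= p_delivery < 1.0:
        raise ValueError("delivery probability must lie in [0, 1)")
    return math.ceil(1.45 * math.log(1.0 / (1.0 - p_delivery)))

for p in (0.5, 0.9, 0.99):
    print(p, min_retransmissions(p))
```

The count grows only logarithmically as the target probability approaches one, so tightening the delivery requirement from 0.9 to 0.99 adds a few retransmissions rather than multiplying them.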

C.  Application  Layer 

Traffic in the application layer is divided into two classes: real-time and best-effort. Each node in the mobile ad hoc network independently regulates best-effort traffic. It is proposed to control the rate of the best-effort traffic to avoid excessive delays of the real-time traffic by using local per-hop delays as feedback to a local rate controller [3]. The general behavior of a congestion-controlled system is illustrated in Fig. 1. A control algorithm can keep the system operating around, or preferably close to, the "cliff", which ensures maximum system throughput, but at the cost of a large average packet delay. The control algorithm discussed here, on the other hand, keeps the system at the delay "knee", where the system throughput is almost the same as at the cliff, but the buffers are significantly less loaded, so the delay is close to the minimum. Because loss typically happens at the cliff, while delays start to increase at the knee, we use the per-hop MAC delay as feedback for local control instead of the packet loss.


Fig. 1. General Behavior of a Congestion-controlled System

When the MAC layer acquires access to the channel, the nodes will exchange the RTS-CTS-DATA-ACK packets. Once the transmitter receives an ACK packet, the packet has been transmitted successfully. The packet delay represents the time it took to send the packet between the transmitter and the next-hop receiver, including the deferred time and the time to fully acknowledge the packet. In this paper, we assume that there will always be best-effort traffic present that can be locally and rapidly rate controlled in an independent manner at each node to yield the necessary low delays and stable throughputs.

D.  Energy 

A  mobile  node  consumes  significant  energy  when  it 
transmits  or  receives  a  packet.  But  we  will  not  consider 
the  energy  consumed  when  the  mobile  node  is  idle. 

The distance between two nodes is variable in mobile ad hoc networks, and the power loss model is used. To send a packet, the sender consumes [9]

$$P_{tx} = P_{elec} + \epsilon_{fs} \cdot d, \tag{2}$$

and to receive the packet, the receiver consumes

$$P_{rx} = P_{elec}, \tag{3}$$

where $P_{elec}$ represents the power that is necessary for digital processing and modulation, and $\epsilon_{fs}$ represents the power dissipated in the amplifier for free-space transmission over distance d.

A common characteristic of most application scenarios of mobile ad hoc networks is that mobile nodes have only a limited energy supply, which might not even be rechargeable; hence they have to be as energy-efficient as possible. Transmitter power control allows interfering communication links sharing the same channel to achieve their required QoS levels, minimizing the needed power, mitigating the channel interference, and maximizing the network user/link capacity.

E.  Delay 

The  packet  transmission  delay  between  the  mobile 
nodes  includes  three  parts:  the  wireless  channel  transmis¬ 
sion  delay,  the  Physical/MAC  layer  transmission  delay, 
and  the  queuing  delay  [10]. 

Defining D as the distance between two nodes and C as the speed of light, the wireless channel transmission delay is:

$$Delay_{ch} = \frac{D}{C}. \tag{4}$$

The Physical/MAC layer transmission delay is decided by the interaction of the transmitter and the receive channel, the node density, the node traffic intensity, etc.

The queuing delay is decided by the I/O system processing rate of the mobile node and the subqueue length in the node. In order to make the system "stable", the rate at which each node transfers packets intended for its destination must ensure that, at all nodes, the queue lengths do not grow without bound and the average delays remain bounded.

F.  Node  Mobility  and  Channel  Fading 

Mobility of a mobile node generates a Doppler shift, which is a key parameter of the fading channel. The Doppler shift is

$$f_d = \frac{v}{c} f_c, \tag{5}$$

where v is the ground speed of a mobile node, c is the speed of light ($3 \times 10^8$ m/s), and $f_c$ is the carrier frequency. In our simulation, the carrier frequency is 6 GHz. For reference, if a node moves with a speed of 10 m/s, the Doppler shift is 200 Hz.
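The reference figure quoted above follows directly from (5), as the short sketch below shows. The function name is hypothetical, and the speed of light is rounded to the value used in the text.

```python
SPEED_OF_LIGHT = 3.0e8  # m/s, the rounded value used in the text

def doppler_shift_hz(ground_speed_mps, carrier_hz):
    """Doppler shift f_d = (v / c) * f_c from (5)."""
    return ground_speed_mps / SPEED_OF_LIGHT * carrier_hz

# A node moving at 10 m/s with a 6 GHz carrier gives roughly 200 Hz.
print(doppler_shift_hz(10.0, 6.0e9))
```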

We model channel fading in ad hoc networks as Rician fading. Rician fading occurs when there is a strong specular (direct path or line of sight component) signal in addition to the scatter (multipath) components. For example, in communication between two infrared sensors, there exists a direct path. The channel gain,

$$g(t) = g_I(t) + j g_Q(t), \tag{6}$$

can be treated as a wide-sense stationary complex Gaussian random process, where $g_I(t)$ and $g_Q(t)$ are Gaussian random processes with non-zero means $m_I(t)$ and $m_Q(t)$, respectively, and the same variance $\sigma^2$. The magnitude of the received complex envelope then has a Rician distribution,

$$p_\alpha(x) = \frac{x}{\sigma^2} \exp\left(-\frac{x^2 + s^2}{2\sigma^2}\right) I_0\left(\frac{xs}{\sigma^2}\right), \quad x \ge 0, \tag{7}$$

where

$$s^2 = m_I^2(t) + m_Q^2(t) \tag{8}$$

and $I_0(\cdot)$ is the zero-order modified Bessel function of the first kind. This kind of channel is known as a Rician fading channel. A Rician channel is characterized by two parameters: the Rician factor K, which is the ratio of the direct path power to that of the multipath, i.e., $K = s^2 / 2\sigma^2$, and the Doppler spread (or single-sided fading bandwidth) $f_d$. We simulate the Rician fading using a direct path added to a Rayleigh fading generator. The Rayleigh fade generator is based on Jakes' model [11], in which an ensemble of sinusoidal waveforms is added together to simulate the coherent sum of scattered rays with Doppler spread $f_d$ arriving from different directions at the receiver. The amplitude of the Rayleigh fade generator is controlled by the Rician factor K.
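To make the role of K concrete, the sketch below draws Rician envelope samples from the definition in (6) and (8). It is a simplification and not the Jakes-based generator described above: the samples are statistically independent, so no Doppler spread is modeled, and the direct path is placed entirely on the in-phase axis. All names are hypothetical.

```python
import math
import random

def rician_envelope_samples(k_factor, sigma, n_samples, seed=0):
    """Independent Rician envelope samples |g| for a given Rician factor K and scatter sigma."""
    rng = random.Random(seed)
    s = math.sqrt(2.0 * k_factor) * sigma  # direct-path amplitude from K = s^2 / (2 sigma^2)
    samples = []
    for _ in range(n_samples):
        g_i = s + rng.gauss(0.0, sigma)  # in-phase: direct path plus scatter
        g_q = rng.gauss(0.0, sigma)      # quadrature: scatter only
        samples.append(math.hypot(g_i, g_q))
    return samples

samples = rician_envelope_samples(k_factor=5.0, sigma=1.0, n_samples=20000)
mean_power = sum(x * x for x in samples) / len(samples)
print(mean_power)  # should be close to s^2 + 2 sigma^2 = 2 sigma^2 (K + 1)
```

Setting K to zero removes the direct path and the envelope becomes Rayleigh distributed, which is the limiting case the Rayleigh fade generator covers.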




BPSK, QPSK, 16-QAM and 64-QAM are the supported modulation schemes for the IEEE 802.11a OFDM physical layer. Their performance curves with Rician fading are shown in Fig. 2.


Fig.  2.  Modulation  Curves  with  Rician  Fading 

After we introduce channel coding and node mobility into the modulation schemes, the modulation curves change considerably. For the same SNR, channel coding improves the BER performance, while mobility degrades it.

G.  One-step  Markov  Path  Model 

The  mobile  nodes  are  roaming  independently  with 
variable  ground  speed.  The  mobility  model  is  called  one- 
step  Markov  path  model  [12].  The  probability  of  moving 
in  the  same  direction  as  the  previous  move  is  higher 
than  other  directions  in  this  model,  which  means  this 
model  has  memory.  Fig.3  shows  the  probability  of  the 
six  directions. 
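The memory property of this mobility model can be sketched as follows. The probabilities of Fig. 3 are not reproduced in the text, so the value p_same = 0.5 and the equal split among the other five directions are purely hypothetical placeholders; the function name is also an assumption.

```python
import random

def next_direction(prev, p_same=0.5, n_dirs=6, rng=random):
    """Pick the next of n_dirs headings; the previous heading is favored (hypothetical weights)."""
    p_other = (1.0 - p_same) / (n_dirs - 1)
    weights = [p_same if k == prev else p_other for k in range(n_dirs)]
    return rng.choices(range(n_dirs), weights=weights, k=1)[0]

rng = random.Random(0)
path = [0]
for _ in range(10):
    path.append(next_direction(path[-1], rng=rng))
print(path)
```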


Fig.  3.  One-step  Markov  Path  Model 


III.  Overview  of  Fuzzy  Logic  Systems 

Figure  4  shows  the  structure  of  a  fuzzy  logic  system 
(FLS). 


Fig.  4.  The  structure  of  a  fuzzy  logic  system 


When an input is applied to an FLS, the inference engine computes the output set corresponding to each rule. The defuzzifier then computes a crisp output from these rule output sets [13]. Consider a p-input 1-output FLS, using singleton fuzzification, center-of-sets defuzzification [14] and "IF-THEN" rules of the form [15]

$R^l$: IF $x_1$ is $F_1^l$ and $x_2$ is $F_2^l$ and ... and $x_p$ is $F_p^l$, THEN y is $G^l$.


Assuming singleton fuzzification, when an input $x' = \{x'_1, \ldots, x'_p\}$ is applied, the degree of firing corresponding to the lth rule is computed as

$$\mu_{F_1^l}(x'_1) \star \mu_{F_2^l}(x'_2) \star \cdots \star \mu_{F_p^l}(x'_p) = \mathcal{T}_{i=1}^{p} \mu_{F_i^l}(x'_i), \tag{9}$$

where $\star$ and $\mathcal{T}$ both indicate the chosen t-norm. There are many kinds of defuzzifiers. In this paper, we focus, for illustrative purposes, on the center-of-sets defuzzifier. It computes a crisp output for the FLS by first computing the centroid, $c_{G^l}$, of every consequent set $G^l$, and then computing a weighted average of these centroids. The weight corresponding to the lth rule consequent centroid is the degree of firing associated with the lth rule, $\mathcal{T}_{i=1}^{p} \mu_{F_i^l}(x'_i)$, so that

$$y_{cos}(x') = \frac{\sum_{l=1}^{M} c_{G^l} \mathcal{T}_{i=1}^{p} \mu_{F_i^l}(x'_i)}{\sum_{l=1}^{M} \mathcal{T}_{i=1}^{p} \mu_{F_i^l}(x'_i)}, \tag{10}$$

where M is the number of rules in the FLS.
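The center-of-sets computation in (10) can be sketched in a few lines once the membership values for a given input are known. The sketch assumes the product as the t-norm (the text leaves the t-norm open), and the function names and sample numbers are illustrative.

```python
def firing_degree(memberships):
    """Product t-norm (assumed) over the antecedent membership values of one rule."""
    degree = 1.0
    for mu in memberships:
        degree *= mu
    return degree

def center_of_sets(rule_memberships, centroids):
    """rule_memberships[l][i] is the membership of input i in rule l; centroids[l] is c_{G^l}."""
    firings = [firing_degree(m) for m in rule_memberships]
    total = sum(firings)
    if total == 0.0:
        raise ValueError("no rule fired for this input")
    return sum(c * f for c, f in zip(centroids, firings)) / total

# Two rules, two inputs: firing degrees 0.5 and 0.25, centroids 2.0 and 4.0.
print(center_of_sets([[1.0, 0.5], [0.5, 0.5]], [2.0, 4.0]))
```

The output always lies between the smallest and largest centroid of the rules that fired, since it is a convex combination of them.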


IV.  Fuzzy  Application  For  Cross-layer  Design 

AMC, transmission power, retransmission times and rate control decision govern the energy consumption and QoS provision. Choosing a proper adjusting factor for these parameters determines the performance of the wireless ad hoc network.

We  collect  the  knowledge  for  adjusting  factor  selection 
based  on  the  following  three  antecedents: 

1) Antecedent 1. Ground speed.

2) Antecedent 2. Average delay.

3) Antecedent 3. Packets successful transmission ratio.


The linguistic variables used to represent the ground speed, average delay and packets successful transmission ratio were divided into three levels: low, moderate, and high. The consequents (the adjusting factors for the AMC, transmission power, retransmission times and rate control decision) were divided into 9 levels: decrease one, decrease two, decrease three, decrease four, unchanged, increase one, increase two, increase three and increase four. Fig. 5 shows the FLS application for the cross-layer design.


Fig.  5.  Cross-layer  Design  Algorithm 

We  designed  questions  such  as: 

IF ground speed is low, average delay is low, and packets successful transmission ratio is high, THEN the adjusting factor is _____.

So we need to set up $3^3 = 27$ rules for this FLS (because every antecedent has 3 fuzzy subsets, and there are 3 antecedents). We summarize these rules in Table II.
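The count of 27 antecedent combinations can be checked by enumerating them. The short Python sketch below does this; the variable names are illustrative, and the ordering (last antecedent varying fastest) is an assumption chosen because it matches the row order of Table II.

```python
from itertools import product

LEVELS = ("low", "moderate", "high")
ANTECEDENTS = (
    "ground speed",
    "average delay",
    "packets successful transmission ratio",
)

# One tuple of levels per rule; the last antecedent varies fastest.
rule_antecedents = list(product(LEVELS, repeat=len(ANTECEDENTS)))

for number, levels in enumerate(rule_antecedents, start=1):
    conditions = ", ".join(f"{name} is {level}"
                           for name, level in zip(ANTECEDENTS, levels))
    print(f"Rule {number}: IF {conditions}")
```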

We used trapezoidal membership functions (MFs) to represent low, high, increase four, and decrease four, and triangular MFs to represent moderate, unchanged, increase one, increase two, increase three, decrease one, decrease two, and decrease three. We show these MFs in Fig. 6 and Fig. 7.
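For readers without the figures at hand, the two MF shapes can be written out as follows. This is a minimal Python sketch; the breakpoints and the 0 to 10 input scale are hypothetical and are not read from Fig. 6 or Fig. 7.

```python
def triangle(x, a, b, c):
    """Triangular MF: 0 at a and c, peak of 1 at b (requires a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def trapezoid(x, a, b, c, d):
    """Trapezoidal MF: rises on [a, b], equals 1 on [b, c], falls on [c, d].
    Setting a == b or c == d gives a shoulder at the edge of the domain."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

# Hypothetical antecedent sets on an assumed 0..10 scale.
def low(x):
    return trapezoid(x, 0.0, 0.0, 1.0, 4.0)

def moderate(x):
    return triangle(x, 1.0, 5.0, 9.0)

def high(x):
    return trapezoid(x, 6.0, 9.0, 10.0, 10.0)
```

The shoulder form is why the extreme labels (low, high, increase four, decrease four) are natural candidates for trapezoids: they stay at full membership up to the edge of the domain.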

In our approach to forming a rule base, we chose a single consequent for each rule. We designed a fuzzy logic system using rules such as:

$R^l$: IF ground speed ($x_1$) is $F_1^l$, average delay ($x_2$) is $F_2^l$, and packets successful transmission ratio ($x_3$) is $F_3^l$, THEN the adjusting factor ($y$) is $c_G^l$.

For every input $(x_1, x_2, x_3)$, the output is computed using

$$y(x_1, x_2, x_3) = \frac{\sum_{l=1}^{27} \mu_{F_1^l}(x_1)\, \mu_{F_2^l}(x_2)\, \mu_{F_3^l}(x_3)\, c_G^l}{\sum_{l=1}^{27} \mu_{F_1^l}(x_1)\, \mu_{F_2^l}(x_2)\, \mu_{F_3^l}(x_3)} \qquad (11)$$


Fig. 6.  MFs for antecedents


Fig. 7.  MFs for consequents


We apply (11) to compute the adjusting factors and adjust the network parameters dynamically. Compared with constant parameters, the fuzzy optimization for cross-layer design can meet the QoS and energy requirements.


V.  Simulations 

We implemented the simulation model using OPNET Modeler. The simulation region was 300 x 300 meters. There were 12 mobile nodes in the simulation model, and the nodes roamed independently with a variable ground speed between 0 and 10 meters per second. The mobility model was the one-step Markov path model. The movement changed the distance between mobile nodes.

1)  Average Delay: Because data communications in mobile networks have timing constraints, it is important to design the network algorithm to meet an end-to-end deadline [16]. We used the average delay to evaluate the network performance.

$$d_{\mathrm{average}} = \frac{\sum_{i=1}^{k} d_i}{k} \qquad (12)$$

Each packet was timestamped when the source mobile node generated it. The transmission delay was the interval between that timestamp and the time at which the destination mobile node received the packet.
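The delay metric is simple enough to state as code. The Python sketch below assumes a hypothetical packet log of (generation time, reception time) pairs in seconds; the log format and the sample values are illustrative only.

```python
def average_delay(packets):
    """Mean transmission delay over received packets.

    `packets` is an iterable of (generated_at, received_at) pairs in seconds.
    """
    delays = [received_at - generated_at for generated_at, received_at in packets]
    if not delays:
        raise ValueError("no received packets to average over")
    return sum(delays) / len(delays)

# Hypothetical log of three received packets.
packet_log = [(0.0, 0.5), (1.0, 1.25), (2.0, 2.75)]
d_average = average_delay(packet_log)
```

Only packets that actually reach their destination contribute a delay sample, so lost packets are outside this metric.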


Fig. 8 shows the delay performance with constant parameters and after cross-layer optimization for the real-time traffic, the best-effort traffic, and all traffic combined. Cross-layer optimization traded off the average delay between the real-time traffic and the best-effort traffic. For the real-time traffic, cross-layer optimization increased the average delay by about 0.6 seconds. For the best-effort traffic, however, it reduced the delay by up to 90.53%. For all traffic combined, it reduced the delay by up to 71.85%, which means that cross-layer optimization improved the average delay performance of the whole system. As the best-effort case shows, cross-layer optimization also kept the average delay "stable", which is important for communication system design.


Fig.  8.  Average  Delay 

2)  Energy Efficiency: Recharging the battery is not convenient, so energy efficiency is extremely important for mobile ad hoc networks. The network should keep a sufficient number of "live" mobile nodes to collect data, which means that the network needs to keep the energy balanced among the mobile nodes. We used the number of remaining alive nodes as the measure of energy efficiency.

In (2) and (3), we assumed $P_{elec}$ was equal to $6.0 \times 10^{-4}$ and $\epsilon_{fs}$ was equal to $6.0 \times 10^{-4}$. We assumed that the initial energy of each mobile node was 0.07 J.

When the remaining energy of a mobile node fell below a certain threshold, the node was considered "dead". In this simulation, we chose $1.2 \times 10^{-3}$ as the threshold. A "dead" node could no longer transmit or receive packets, so it was ignored by the network. When the number of alive nodes in a mobile ad hoc network falls below a certain threshold, the network no longer works.
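The alive-node count used as the energy metric can be sketched as below. The Python snippet takes the threshold and the 0.07 J node energy from the text; it assumes the threshold is expressed in the same unit as the node energy (the text gives no unit), and the remaining-energy values are invented for illustration.

```python
DEAD_THRESHOLD = 1.2e-3   # threshold from the text; unit assumed to be joules
INITIAL_ENERGY = 0.07     # energy of each mobile node, in joules

def is_alive(remaining_energy, threshold=DEAD_THRESHOLD):
    """A node is "dead" once its remaining energy is lower than the threshold."""
    return remaining_energy >= threshold

def count_alive(remaining_energies, threshold=DEAD_THRESHOLD):
    """Number of nodes that can still transmit and receive packets."""
    return sum(1 for energy in remaining_energies if is_alive(energy, threshold))

# Hypothetical snapshot of remaining energy for five nodes.
snapshot = [INITIAL_ENERGY, 0.03, 0.002, 0.0011, 0.0]
alive = count_alive(snapshot)
```

Sampling `count_alive` over simulation time would give a curve of the kind plotted in Fig. 9.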


As Fig. 9 shows, after fuzzy optimization the time until the first node "died" was 1.67 times longer than that with constant parameters, which was 1589 seconds.


Fig.  9.  Node  Alive 

3)  Network Efficiency: The mobile ad hoc network was used to collect data and transfer packets. The throughput of transmitted packets was one of the parameters used to evaluate the network efficiency. In our simulation, we assumed that data collection at each mobile node followed a Poisson distribution with an arrival interval of 0.2 second. As Fig. 10 shows, cross-layer optimization traded off throughput between the real-time traffic and the best-effort traffic. For the real-time traffic, the throughput of the network after cross-layer optimization was about 0.02% smaller than that with constant parameters. For the best-effort traffic, however, the throughput of the network was up to 71.99% larger. For all traffic combined, the throughput of the network after cross-layer optimization was up to 32.52% larger, which means that cross-layer optimization improved the throughput performance of the whole system. As with the average delay, cross-layer optimization achieved a "stable" throughput performance.
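The traffic assumption can be illustrated with a small generator. The Python sketch below reads the 0.2 second figure as the mean gap between arrivals of a Poisson process, which is an interpretation of the text rather than a documented simulator setting; the function name, seed, and duration are arbitrary.

```python
import random

def poisson_arrival_times(mean_interval, duration, rng):
    """Arrival times in [0, duration) for a Poisson process.

    Gaps between arrivals are exponential with the given mean (seconds).
    """
    rate = 1.0 / mean_interval
    times = []
    t = rng.expovariate(rate)
    while t < duration:
        times.append(t)
        t += rng.expovariate(rate)
    return times

rng = random.Random(0)                       # fixed seed for repeatability
arrivals = poisson_arrival_times(0.2, 1000.0, rng)
packets_per_second = len(arrivals) / 1000.0  # close to 5 for a 0.2 s mean gap
```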

We introduced the fuzzy logic system into cross-layer design. Compared with other algorithms for cross-layer design, the fuzzy method is flexible and simpler to implement, and its performance is also impressive.

Fig. 10.  Throughput


VI.  Conclusion

Cross-layer design is an effective method to improve the performance of mobile ad hoc networks. We applied the fuzzy logic system to combine the physical layer, data-link layer, and application layer. We selected ground speed, average delay, and packets successful transmission ratio as antecedents. The output of the FLS provides adjusting factors for the AMC, transmission power, retransmission times, and rate control decision. Simulation shows that applying the FLS in cross-layer design can reduce the average delay, increase the throughput, and extend the network lifetime. After cross-layer optimization, the network performance parameters also remain stable. In the future, we can consider other layers, such as the network layer, in the cross-layer design.
TABLE II

The fuzzy rules for cross-layer design

Antecedent 1 is its ground speed, Antecedent 2 is its average delay and Antecedent 3 is its packets successful transmission ratio.
Consequent 1 is adjusting factor for retransmission times, Consequent 2 is adjusting factor for AMC, Consequent 3 is adjusting factor for
transmission power and Consequent 4 is adjusting factor for rate control decision.
Rows with fewer than four consequent values are incomplete in the source; the surviving values are listed in their original order, and [illegible] marks an unreadable cell.

Rule #  Antecedent 1  Antecedent 2  Antecedent 3  Consequents 1-4
1       low           low           low           decrease two; unchanged; unchanged
2       low           low           moderate      unchanged; unchanged; decrease two; decrease two
3       low           low           high          decrease two; decrease four; decrease four
4       low           moderate      low           decrease one
5       low           moderate      moderate      decrease one; decrease one; decrease one
6       low           moderate      high          decrease three; increase three; decrease three; decrease three
7       low           high          low           unchanged; unchanged
8       low           high          moderate      decrease two; unchanged; unchanged
9       low           high          high          decrease four; increase four; decrease two; decrease two
10      moderate      low           low           increase three; decrease three
11      moderate      low           moderate      decrease one; decrease one; decrease one
12      moderate      low           high          increase one; decrease three; decrease three
13      moderate      moderate      low           decrease two
14      moderate      moderate      moderate      unchanged; unchanged; unchanged; unchanged
15      moderate      moderate      high          decrease two; decrease two; decrease two
16      moderate      high          low           [illegible]; decrease one; increase three; increase three
17      moderate      high          moderate      decrease one
18      moderate      high          high          decrease three; increase three; decrease one; decrease one
19      high          low           low           [illegible]; decrease four; increase two
20      high          low           moderate      decrease two; unchanged; unchanged
21      high          low           high          unchanged; unchanged; decrease two; decrease two
22      high          moderate      low           increase three; decrease three; increase three; increase three
23      high          moderate      moderate      decrease one
24      high          moderate      high          decrease one; decrease one; decrease one
25      high          high          low           increase two; decrease two; increase four; increase four
26      high          high          moderate      unchanged; unchanged
27      high          high          high          decrease two; unchanged; unchanged

